Thoughtfully designing services and rigorously testing software to support personal information management (PIM) requires understanding the relevant collections, but relatively little is known about what people keep in their file collections, especially personal collections. Complementing recent work on the structure of 348 file collections, we examine those collections' contents, how much content is duplicated, and how collections used for personal matters differ from those used for study and work. Though all collections contain many images, some intuitively common file types are surprisingly scarce. Personal collections contain more audio than others, knowledge workers' collections contain more text documents but far fewer folders, and IT collections exhibit unusual traits. Collection duplication is correlated to collections' structural traits, but surprisingly, not to collection age. We discuss our findings in light of prior works and provide implications for various kinds of information research.
翻译:暂无翻译