Collex

Summary:
Collex is a tool developed at the University of Virginia’s Applied Research in Patacriticism lab (ARP) and currently operated in conjunction with NINES (Networked Interface for Nineteenth-century Electronic Scholarship). Described as an “interpretive hub” (Nowviskie), Collex acts as an interface for nine different peer-reviewed, scholarly databases. The interface allows users to search all nine databases at once, while results retain the unique characteristics of each individual source. Additionally, users can create exhibits for their own personal use, or they may submit exhibits to be shared with all users. As such, Collex and its relationship to data evolve as users interact with it, relying on folksonomy and user-generated relationships to construct new ways of viewing the information it contains.

Description:
In September 2005, an international group of digital scholars held a “Tools Summit” at the University of Virginia. One of the fundamental goals of this conference was to address the following issue: “the general field of humanities education and scholarship will not take the use of digital technology seriously until one demonstrates how its tools improve the ways we explore and explain our cultural inheritance — until, that is, they expand our interpretational procedures” (McGann, quoted in Nowviskie 2). Part of the NINES project and developed by the ARP lab at UVA, Collex is geared toward just such an expansion of interpretation. The fundamental philosophy behind Collex moves away from centrally organized, hierarchical content and away from a data structure in which the complex relationships of an archive are only apparent to those who are intimately familiar with the software platform.

To decentralize content, Collex relies on a federated model of content repository in which various NINES-approved archives and journals are included under the auspices of one search engine. When users access an object (objects range from essays, to artwork, to entire scanned books, to single poems within a scanned book), they are taken to the original online source; thus, Collex retains the unique interpretive and presentational framework of each individual archive or journal. To make the relationships between objects more explicit, Collex utilizes the Resource Description Framework data model (RDF). (A component of the Semantic Web, RDF provides a method for stating in machine-processable fashion the logical relations [“properties”] between entities [“resources”], so that, for example, the entities “Mary Shelley” and “The Last Man” can be linked by the relation “author of.” The syntax of RDF is commonly implemented in XML.) Faceted classification, “a non-hierarchical means of expressing ontological relationships” (Nowviskie 9), and objects labeled “more like this” allow the emergence of relationships between objects that users may not have anticipated. The RDF structure also allows users to add their own descriptive tags to objects and build “COllections.” Future releases will allow users to build EXhibits of material to share with others (COllections + EXhibits = Collex). In addition to re-visioning the way in which scholarly resources are organized and accessed, Collex “translate[s] the products of user interaction into RDF objects within the system itself” (Nowviskie 9).
Bethany Nowviskie, one of the original designers of The Rossetti Archive and a primary developer on the Collex project, reflects, “new technologies in open archives, modeling, and semantic analysis began to suggest a research collective not organized around passive access to online resources, but rather sensitive to the uses to which those resources are put, to the ways in which they are continually re-interpreted” (5).
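The RDF idea described above, in which a property links two resources, can be illustrated with a minimal Python sketch. This is not Collex's code: the `ex:` identifiers, the property names, and the tag "apocalypse" are hypothetical and chosen only for illustration. The sketch shows how a scholarly statement ("Mary Shelley is the author of The Last Man") and a user-added tag can sit side by side as statements of the same shape.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property (relation)
    obj: str        # the related resource or literal value


# Hypothetical identifiers; the report does not give Collex's real URIs.
triples = [
    Triple("ex:Mary_Shelley", "ex:authorOf", "ex:The_Last_Man"),
    Triple("ex:Mary_Shelley", "ex:authorOf", "ex:Frankenstein"),
    Triple("ex:The_Last_Man", "ex:genre", "Fiction"),
    # A user-added tag is stored as just another statement about the object.
    Triple("ex:The_Last_Man", "ex:userTag", "apocalypse"),
]


def objects_of(subject: str, predicate: str) -> list[str]:
    """Return everything linked to `subject` by `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def subjects_with(predicate: str, obj: str) -> list[str]:
    """Return every resource that has `predicate` pointing at `obj`."""
    return [t.subject for t in triples
            if t.predicate == predicate and t.obj == obj]


works = objects_of("ex:Mary_Shelley", "ex:authorOf")
tagged = subjects_with("ex:userTag", "apocalypse")
print(works, tagged)
```

Because tags and editorial metadata share one structure, the same query functions serve both, which is one way to read the claim that user interactions become objects within the system itself.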

In this way, each user becomes an important part of Collex’s structure. The whitepaper for the tool states, “we are now persuaded that NINES must be understood as a social system” (NINES 7) [1]. The developers acknowledge that Collex’s folksonomical characteristics only take on interpretive importance as the community of users develops and collections and exhibits are shared. To that end, Collex 1.0 was announced to humanities-based researchers in August 2006. The current version offers three different ways to search the NINES resources, and users can tag and collect objects from among the search results. Version 2.0, currently slated for a December 2006 release, will add a function that lets users curate specialized online exhibits. Exhibits can be built and shared among users, with some peer-reviewed exhibits becoming part of the general NINES resources.

Currently, when users first log in to Collex (no login is required for searching, but one is necessary for tagging, etc.), they are presented with the following screen:

Users may access content via the cloud visualization to the left or by entering search parameters to the right. The cloud visualizations allow scholars to ascertain at a glance what type of information is available. Larger font sizes indicate more content in a given category:
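The sizing rule behind such a cloud can be sketched in a few lines of Python. The report does not say how Collex computes font sizes, so the linear scaling below, the pixel range, and the category counts are all assumptions made for illustration.

```python
def cloud_font_sizes(counts: dict[str, int],
                     min_px: int = 12,
                     max_px: int = 36) -> dict[str, int]:
    """Map each category's object count to a font size (linear scaling)."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for label, n in counts.items():
        if span == 0:
            # All categories are equally large, so use the smallest size.
            sizes[label] = min_px
        else:
            sizes[label] = round(min_px + (n - lo) / span * (max_px - min_px))
    return sizes


# Invented counts, not real NINES figures.
example_counts = {"Poetry": 400, "Criticism": 100, "Visual Art": 250}
sizes = cloud_font_sizes(example_counts)
print(sizes)
```

A logarithmic scale is a common alternative when a few categories dwarf the rest; the linear version is used here only because it is the simplest to read.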

Once an object is selected from a cloud, users are presented with the various tags associated with the object, as well as additional similar objects. The “more like this” function allows researchers to engage in “serendipitous” searching in which relationships emerge that may not have been initially apparent. From here, users may go to the original archive to view the object, or continue mining via the list of tags or “more like this” function.

Users may also add tags and annotations and “collect” objects. For instance, adding the tag “mother” to Dante Gabriel Rossetti’s The Seed of David saves the tag and also adds the object to a personalized collection. Researchers now have the choice between cloud visualizations of the entire site or visualizations of their own collections:

If users prefer a more structured search, they may set parameters using phrases, genre, source, etc. For example, a search on the phrase “media” yields results with the term in any field, which the user can then add to her collection.

In addition to using the Collex search interface, users may search each individual archive or journal and use a Collex bookmarklet to add items to their collections.

Although Collex is being developed by NINES, it is the hope of the development team that as the tool is improved, it will be deployed in other contexts.

Research Context:
Collex is of interest to a variety of fields. Cognitive psychologists may be interested in the way that the tool facilitates metacognitive awareness and networks of representations in the research process. In addition, those interested in Semantic Web technologies will find Collex, particularly its forthcoming exhibit functions and its integration of user interactions, of value. Finally, researchers in the field of social networking may find Collex valuable in relation to its role in facilitating a networked community of scholars.

Technical Analysis:
Content
The content of NINES is all peer-reviewed and currently draws from nine scholarly archives or journals:

As NINES continues to develop, content will come from four sources, with contributors responsible for creating NINES-ready RDF. NINES provides stylesheets and examples to assist contributors in ensuring their objects are NINES-ready.

  1. User-created material from Collex: most user-generated material will be housed in a non-peer-reviewed section of the site. Users may submit material for peer review to be included in the “sanctioned set of NINES material” (NINES). This is the one area in which contributors are not responsible for RDF; Collex automatically generates the metadata for collections and exhibits.
  2. Federated scholarly archives: each archive decides which objects will be included in NINES.
  3. Resource records from NINES-approved institutions: this may include museums, libraries, etc. Contributors may not have discrete web pages for each object and may instead provide accession numbers and a link to the institution’s home page.
  4. Contributions to NINES-affiliated journals: these may be print, online, or both.

RDF
Collex employs XML-based RDF according to the standards of the Dublin Core Metadata Initiative and the Open Archives Initiative. Each object is initially categorized as a primary or secondary object and is then given the basic array of facets: author, title, date, and genre. Genre facets are based upon the Cambridge Bibliography of English Literature.
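To make the facet structure concrete, here is a minimal Python sketch that builds an RDF/XML description carrying the four basic facets plus the primary/secondary distinction. The RDF and Dublin Core namespaces are the standard ones, but the `ex:` namespace, its `genre` and `objectKind` properties, and the object URI are hypothetical; the report does not specify how NINES actually encodes genre or object type.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
EX_NS = "http://example.org/collex-sketch#"  # hypothetical namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("ex", EX_NS)


def describe(uri: str, author: str, title: str, date: str,
             genre: str, primary: bool = True) -> ET.Element:
    """Build an rdf:RDF element holding one described object."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": uri})
    # Author, title, and date map onto standard Dublin Core elements.
    ET.SubElement(desc, f"{{{DC_NS}}}creator").text = author
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    ET.SubElement(desc, f"{{{DC_NS}}}date").text = date
    # Genre and primary/secondary status use made-up properties here.
    ET.SubElement(desc, f"{{{EX_NS}}}genre").text = genre
    ET.SubElement(desc, f"{{{EX_NS}}}objectKind").text = (
        "primary" if primary else "secondary")
    return root


record = describe("http://example.org/objects/the-last-man",
                  "Mary Shelley", "The Last Man", "1826", "Fiction")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A contributing archive would produce one such description per object, which is roughly the task the NINES stylesheets and examples are meant to support.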

Search Engine

Collex utilizes Solr, a Lucene-based search engine.
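As a rough illustration of how a faceted search might be posed to Solr, the Python sketch below assembles a query URL using Solr's standard request parameters (`q`, `fq`, `facet`, `facet.field`, `rows`, `wt`). The endpoint address and the field names `genre` and `archive` are assumptions; the report does not describe Collex's actual Solr schema or deployment, and the sketch only builds the URL without contacting a server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real deployment would have its own host and path.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_query(phrase, filters=None, facet_fields=()):
    """Return a Solr select URL for a phrase search with optional facets."""
    params = [("q", phrase), ("wt", "json"), ("rows", "10")]
    # Each filter narrows the result set without affecting relevance scoring.
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    # Facet counts tell the interface how many hits fall in each category.
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
    return f"{SOLR_SELECT}?{urlencode(params)}"


url = build_query("media", {"genre": "Criticism"}, ["genre", "archive"])
print(url)
```

The facet counts returned by a query of this kind are the sort of data a cloud visualization or a "narrow by genre" control could be built from.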


Evaluation of Opportunities/Limitations for the Transliteracies Topic:

Collex is a unique and exciting tool and may be of great importance to Transliteracies, both as a model and as a potential partnering opportunity. NINES’ use of federated content and recursive user interactions highlights the unique benefits that online research has to offer. This approach, in which the developers look beyond trying to mimic traditional research models and instead aim to emphasize the unique capabilities of online research, is fundamental to the development of a Transliteracies reading tool. The benefits of doing research with Collex, should it be deployed in multiple contexts, are likely to entice even very traditional scholars to engage in more robust online research activities, thus increasing the scholarly audience that may have need of a Transliteracies tool.

While Collex enhances the online research experience, its utility is limited in terms of the actual reading process. Once users access objects from the NINES content, they are still faced with the challenges of applying traditional print paradigms of reading to these digitized objects. Transliteracies may benefit from trying to integrate Collex, or at least Collex-based features, into any eventual tool developed for online reading. A powerful reading tool partnered with Collex could indeed improve the entire research process for countless scholars.

Notes:

[1] Currently, Collex is the only project of NINES and is considered the fundamental organizing principle of the group. At this juncture, one might speak of them interchangeably.

Resources for Further Study:

Points for Expansion: Related Objects for Study:

  tl, 09.12.06

4 Responses to Collex

  1. erikhatcher says:

    Kim – thanks so much for such a thorough review of Collex. I am currently the lead developer on the project. One minor technical correction – Collex uses Lucene via Solr. It currently does not use Nutch (though we have experimented with it for crawling archives eventually).

  2. Kimberly Knight says:

    Erik – Thank you for the correction; I’ve adjusted the report accordingly. I am really looking forward to watching Collex evolve! – Kim