
February 24 LD4L Workshop breakout session: Usage Data

facilitator: Paul Deschner

  1. Usage data sources

    1. OCR-ed bibliographies and page rank

    2. ILL usage

    3. Yahoo circ logs

    4. Web analytics (e.g., DPLA UI analytics, esp. contextual granularity)

    5. Search terms as form of usage; also as compared to other usage data

    6. Entities extracted from queries, not simply literal queries themselves

    7. How often a link is traversed; how many times your link has been reconciled in a triple store

    8. Browsed materials

    9. Citations; also citation networks as compared to other usage data

    10. Course-book lists across institutions
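
Item 7 above suggests counting link traversals as a usage signal. A minimal sketch, assuming a stream of click events with hypothetical `type` and `target` fields (the event format is an illustration, not a standard):

```python
# Hypothetical sketch: tallying link traversals per resource from a
# click-event stream. Event and field names are assumptions for illustration.
from collections import Counter

events = [
    {"type": "link_traversal", "target": "work:123"},
    {"type": "search", "query": "maps"},
    {"type": "link_traversal", "target": "work:123"},
    {"type": "link_traversal", "target": "work:456"},
]

# Count only traversal events, keyed by the traversed resource.
traversals = Counter(
    e["target"] for e in events if e["type"] == "link_traversal"
)
# traversals["work:123"] == 2
```

Per-resource tallies like these could then be aggregated over time windows or, as item 5 under "Use cases" notes, reported back to the link's source.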

  2. StackScore

    1. Makes data muddy

    2. Too many metrics mixed together; need to separate out the metrics

    3. Common metrics needed across institutions

    4. Computational transparency important: metrics and algorithms
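
One way to address points 2 and 4 above is to keep each metric separate and report per-metric contributions alongside the final number. A minimal sketch, assuming per-item counts for circulation, ILL, and citations; the metric names and weights are illustrative assumptions, not the actual StackScore formula:

```python
# Hypothetical sketch of a transparent composite usage score.
# Weights and metric names are illustrative, NOT the real StackScore.
WEIGHTS = {"circulation": 0.5, "ill": 0.3, "citations": 0.2}

def composite_score(metrics: dict) -> dict:
    """Return the final score plus each metric's weighted contribution,
    so the computation stays inspectable rather than muddy."""
    contributions = {
        name: WEIGHTS.get(name, 0.0) * value
        for name, value in metrics.items()
    }
    return {"score": sum(contributions.values()),
            "breakdown": contributions}

result = composite_score({"circulation": 12, "ill": 3, "citations": 7})
# result["breakdown"] shows how much each metric contributed.
```

Publishing the weights and the breakdown, not just the final score, is one concrete form the "computational transparency" above could take, and common metric definitions across institutions would make the breakdowns comparable.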

  3. Negative usage data at local institution

    1. Important to see what users are looking for that the local institution doesn’t have

    2. What doesn’t circulate in-house but is available via ILL

    3. What isn’t read at Columbia but is read at Yale

  4. Usage data runs risk of becoming prescriptive

    1. Blandness of collections when everyone acquires most popular items

  5. Use cases

    1. Keeping tabs on popularity of colleagues’ publications

    2. Usage data as a diagnostic tool for targeted collections: highly invested-in but little-used parts of the collection could prompt an exhibition to raise awareness

    3. Scholars doing research on other scholars’ research and publications

    4. Look at when items were used: what was checked out in last week, month, year, etc.

    5. Link traversals and other link metrics could be sent to link’s source

  6. Long tail issue generally and at own institution

    1. Options: random selection out of tail for exposure, subject-filtered selection

    2. Important that the UI expose long-tail possibilities prominently, above the fold

    3. Usage data from other institutions and ILL balances out the local institution’s biases

  7. Privacy

    1. Opt-in option for users willing to share their usage data

    2. University of Huddersfield (England): more liberal approach to data exposure, including access to clustering (users who borrowed this also borrowed that) and usage by academic course and school

    3. IP-based web stats inherently less risky than personal ID-based circulation data

    4. Anonymization tools important

    5. Clustering dangerous
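
The anonymization point above can be made concrete with pseudonymization: replacing patron IDs with keyed hashes before circulation data is shared. A minimal sketch using Python's standard library; the key handling is an illustrative assumption (real deployments need key management), and as the "clustering dangerous" point notes, hashing alone does not eliminate re-identification risk in small groups:

```python
# Hypothetical sketch: pseudonymizing patron IDs with a keyed hash (HMAC)
# so raw IDs never leave the institution. The key below is illustrative;
# in practice it must be generated securely and kept private.
import hashlib
import hmac

SECRET_KEY = b"institution-local-secret"  # assumption: managed locally

def pseudonymize(patron_id: str) -> str:
    """Map a patron ID to a stable pseudonym; without the key,
    the mapping cannot be reversed or recomputed by outsiders."""
    return hmac.new(SECRET_KEY, patron_id.encode(), hashlib.sha256).hexdigest()

# A shareable circulation record with the patron ID pseudonymized.
record = {"patron": pseudonymize("patron-0042"), "item": "work:123"}
```

Because the same ID always maps to the same pseudonym, usage can still be clustered per (pseudonymous) user, which is exactly why the notes flag clustering as dangerous even over anonymized data.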