Services on linked data

LD4L Workshop Breakout Session, Tuesday, February 24

Risk of not knowing what to search for

may be addressed by
publish starting points & examples of queries and/or canned responses
reconciliation services — not necessarily monopolies or centralized
iterative, with curation and provenance
common API for reconciliation building on the OpenRefine API — specify as much metadata as you have, get ranked results back
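The "specify as much metadata as you have, get ranked results back" idea can be sketched against the shape of the OpenRefine reconciliation API: a JSON payload of keyed queries, each with a name plus whatever type and property/value pairs the caller happens to hold, answered by scored candidates. The mock response, identifiers, and score threshold below are illustrative, not a real service's output.

```python
import json

def build_recon_queries(entries):
    """Build an OpenRefine-style reconciliation payload: each entry
    supplies a name, an optional type, and any extra property/value
    pairs the caller happens to have."""
    queries = {}
    for i, e in enumerate(entries):
        q = {"query": e["name"]}
        if "type" in e:
            q["type"] = e["type"]
        if "properties" in e:
            q["properties"] = [{"pid": p, "v": v}
                               for p, v in e["properties"].items()]
        queries[f"q{i}"] = q
    # The recon API wraps the query set in a single "queries" form field.
    return {"queries": json.dumps(queries)}

# A service would return ranked candidates per query key, e.g.:
mock_response = {
    "q0": {"result": [
        {"id": "viaf:102333412", "name": "Austen, Jane",
         "score": 97.3, "match": True},
        {"id": "viaf:310530402", "name": "Austen, J.",
         "score": 41.0, "match": False},
    ]}
}

def best_match(response, key, threshold=50.0):
    """Pick the top-ranked candidate above a score threshold."""
    candidates = sorted(response[key]["result"],
                        key=lambda c: c["score"], reverse=True)
    if candidates and candidates[0]["score"] >= threshold:
        return candidates[0]
    return None
```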
mashup tools that test connections
sameAs website
validation
RDF data shapes
DCMI RDF validation
extension mechanisms - Schema.org
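The data-shapes point can be sketched as a minimal shape check in the spirit of ShEx/SHACL (not an implementation of either): triples as tuples, a "shape" as the list of predicates required on nodes of a given type. The `ex:`/`dct:` terms and sample data are illustrative.

```python
RDF_TYPE = "rdf:type"

def validate_shape(triples, target_type, required_predicates):
    """Return {node: [missing predicates]} for nodes of target_type."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, set()).add(p)
    problems = {}
    for s, p, o in triples:
        if p == RDF_TYPE and o == target_type:
            missing = [r for r in required_predicates
                       if r not in by_subject[s]]
            if missing:
                problems[s] = missing
    return problems

data = [
    ("ex:work1", RDF_TYPE, "ex:Work"),
    ("ex:work1", "dct:title", "Pride and Prejudice"),
    ("ex:work2", RDF_TYPE, "ex:Work"),   # no title -> flagged
]
issues = validate_shape(data, "ex:Work", ["dct:title"])
```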
query on different axes — query OCLC by VIAF id to get works
ability to push bookmarks but as small graphs of data, consumable by others
semantic web crawling
bookmark
a service where I can push the results of my search, organized by topic
a sort of Mendeley but for everything
add it to a collection I have 
similar to an annotation service
you search, you refine it, you step back — now only save as bookmarks at one level
nobody can use your bookmarks
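Pushing a bookmark as a small graph rather than an opaque URL (the note above says it resembles an annotation service) could look like the sketch below: the target URI, its topic, and its creator serialized as N-Triples using Web Annotation terms. The bookmark node URI scheme is made up for illustration.

```python
def ntriple(s, p, o, literal=False):
    """Serialize one triple as an N-Triples line."""
    obj = f'"{o}"' if literal else f"<{o}>"
    return f"<{s}> <{p}> {obj} ."

def bookmark_graph(user, uri, topic):
    """A bookmark as a consumable mini-graph, not just a saved URL."""
    bm = f"{user}/bookmarks/{topic.replace(' ', '-')}"  # hypothetical node URI
    return [
        ntriple(bm, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
                "http://www.w3.org/ns/oa#Annotation"),
        ntriple(bm, "http://www.w3.org/ns/oa#hasTarget", uri),
        ntriple(bm, "http://purl.org/dc/terms/subject", topic, literal=True),
        ntriple(bm, "http://purl.org/dc/terms/creator", user),
    ]

lines = bookmark_graph("http://example.org/me",
                       "http://example.org/doc", "linked data")
```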
2
a tool that would facilitate entity reconciliation
to put together UN and LC
a first pass, then improve that manually, then 2nd iteration
then publish — surface
manage difference of opinion
provenance
exclude some
centralized entity mapping
feedback by users on the mapping
need protocols
want to discover annotation — known servers with protocols 
collections have been done by many different places
if we do linked data, my list is a list of URIs from many sources
in the UI you won’t see that
assuming accessible SPARQL endpoints
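Given the assumption of accessible SPARQL endpoints, a bookmark list of URIs from many sources can be hydrated with one query per endpoint using a `VALUES` block. The sketch below only builds the query string; the example URIs are illustrative.

```python
def labels_query(uris):
    """Build a SPARQL query fetching rdfs:label for a list of URIs."""
    values = "\n    ".join(f"<{u}>" for u in uris)
    return (
        "SELECT ?item ?label WHERE {\n"
        f"  VALUES ?item {{\n    {values}\n  }}\n"
        "  ?item <http://www.w3.org/2000/01/rdf-schema#label> ?label .\n"
        "}"
    )

q = labels_query([
    "http://viaf.org/viaf/102333412",
    "http://id.loc.gov/authorities/names/n79032879",
])
```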
3
other cleanup tasks —  validation? consistency of ontology use
entity recognition — text mining or analytics for tools — autotaggers
4
constant crawling graphs of linked data
semantically aware web crawling — is it worth going down this path, what’s attached, what has changed
5
provenance space — who’s made a particular assertion for that
in the library domain, could imagine a layer about who’s responsible for an assertion
unspecified.
crowd sourcing — as move up toward the general public, typically track less who did it
variable credibility
acknowledge that
nanopublications
===== group 4 =====
reconciliation services — contains no data, queries a distributed set of resources
individual libraries will become the authorities for special collections — items, people, events
queries to a central area would find a match
cache the sameAs so don’t have to re-query
everybody who consumes has the cross-links
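Treating sameAs as an equivalence relation (the degrees-of-sameAs caveat below notwithstanding), cached links can be collapsed with union-find: once two identifiers are merged, every consumer sees the whole cross-link cluster without re-querying. A minimal sketch with illustrative identifiers:

```python
class SameAsCache:
    """Cache of sameAs links, clustered with union-find (path halving)."""

    def __init__(self):
        self.parent = {}

    def _find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def add_same_as(self, a, b):
        self.parent[self._find(a)] = self._find(b)

    def cluster(self, x):
        """All identifiers known to be sameAs x."""
        root = self._find(x)
        return {u for u in self.parent if self._find(u) == root}

cache = SameAsCache()
cache.add_same_as("viaf:102333412", "lcnaf:n79032879")
cache.add_same_as("lcnaf:n79032879", "wikidata:Q36322")
```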
the sort of thing that OCLC might end up doing — 
could be any type of object — logical to start with works 
brings up the question of degrees of sameAs-ness
when a new match is known, publish that — a notification mechanism
you would add provenance to those links to indicate where they came from
there used to be a plug-in for Netscape with a side-wiki for annotation — anybody could see what everyone else had done
now in the world of unique identifiers — a linkerator - for people to rank what they see
build up ant trails over time, around an object
how to make it in any way central — get it to the browser
how about the annotation example?
regular expressions against EAD for an object to suggest what it links to
feed into a system to validate
then give pointers to the link
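The regex-over-EAD step above might look like the sketch below: scan EAD XML for name elements (real EAD tags) whose text is worth feeding to a reconciliation service. The sample finding aid and the pipeline around the regex are illustrative.

```python
import re

# Match EAD name elements and capture their text content.
NAME_TAG = re.compile(r"<(persname|corpname|geogname)[^>]*>([^<]+)</\1>")

def candidate_links(ead_xml):
    """Yield (tag, text) pairs worth reconciling into links."""
    return [(m.group(1), m.group(2).strip())
            for m in NAME_TAG.finditer(ead_xml)]

sample = """
<ead><archdesc>
  <persname source="lcnaf">Austen, Jane, 1775-1817</persname>
  <geogname>Hampshire (England)</geogname>
</archdesc></ead>
"""
```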
other levels of relationship than sameAs
over time it would aggregate and 
a clustering algorithm — the more a link is traversed, the space reduces
emergence sorting
software crawling the graph - how do you figure out what to trust? the world according to professor X or Y
trust is very tricky
a page rank algorithm for linked data — more for asserters
strengthen the nodes to reflect repeated confidence
repeating assertions in multiple repositories — I agree with them, the +1 or thumbs up
Reddit gets a lot of traction
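The "PageRank for asserters" idea can be sketched with plain power iteration: nodes are sources, an edge A → B means A repeats or endorses an assertion of B's (the +1 above), and rank flows toward sources whose assertions are widely repeated. Node names and weights are illustrative.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed edge list."""
    nodes = {n for e in edges for n in e}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out = {n: [b for a, b in edges if a == n] for n in nodes}
    for _ in range(iterations):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            # A dangling node spreads its rank evenly over all nodes.
            targets = out[n] or list(nodes)
            share = damping * rank[n] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank

# libA and libC both endorse assertions made by libB:
ranks = pagerank([("libA", "libB"), ("libC", "libB"), ("libB", "libA")])
```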
nanopublications
if you reify assertions — to add confidence where have more knowledge or curation
confidence levels
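Reifying an assertion, as suggested above, means making the statement itself a node so confidence and provenance can hang off it (the nanopublication idea in miniature). The `rdf:` reification terms below are the standard ones; the `ex:` terms, curator, and confidence value are illustrative.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def reify(statement_id, s, p, o, asserted_by, confidence):
    """Turn one (s, p, o) assertion into triples about the assertion."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
        (statement_id, "ex:assertedBy", asserted_by),   # illustrative terms
        (statement_id, "ex:confidence", confidence),
    ]

stmt = reify("ex:stmt1",
             "viaf:102333412", "owl:sameAs", "wikidata:Q36322",
             asserted_by="ex:curatorAlice", confidence=0.9)
```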
Wikipedia has a way to accept 
no confidence in semantic search engines
too siloed
visualizations have to be crafted