No longer under active development

Twittervane

Twittervane is a prototype application that collects and analyses Twitter feeds and outputs the URLs mentioned in the tweets. URLs shared on Twitter could point to web resources relevant to web archive collections.
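As an illustration only, the core idea of pulling URLs out of tweet text and counting how often each is mentioned can be sketched in a few lines of Python. This is not Twittervane's actual code: the function name `extract_urls`, the regular expression and the sample tweets are all assumptions made for this sketch, and a real collector would also need to expand shortened links.

```python
import re
from collections import Counter

# Simplified pattern for http(s) links; an assumption for this sketch,
# not the pattern used by Twittervane itself.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")

def extract_urls(tweets):
    """Count how often each URL is mentioned across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        for url in URL_PATTERN.findall(text):
            # Drop punctuation that commonly trails a link in running text.
            counts[url.rstrip(".,;:!?)")] += 1
    return counts

# Hypothetical sample data.
sample_tweets = [
    "Results are in: http://example.org/election-results.",
    "Live coverage http://example.org/election-results and http://example.com/blog",
    "No links here",
]
url_counts = extract_urls(sample_tweets)
```

URLs mentioned most often could then be offered to a curator first, for example via `url_counts.most_common(10)`.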

Evaluation

The Evaluating Twittervane project was funded by the International Internet Preservation Consortium (IIPC) to build on an earlier project, Twittervane.

Six curators from the National Library of New Zealand, the National Library of France and the Library of Congress independently evaluated the Twittervane methodology and provided feedback. Curators had 3 weeks to use and test Twittervane. They not only provided valuable feedback on the user interface and documentation, but also set up collections and assessed the relevance of the URLs reported by Twittervane for their collections. Some of the feedback was addressed where the project's resources allowed; the rest has been logged as future requirements.

The general view is that Twittervane could be useful for events-based collections (e.g. elections, Olympics), as it could reduce the time spent on web searching, especially over a longer period of time. URLs reported by Twittervane tend to point to news sites and online periodicals. However, curators also found that only a small percentage of the URLs found by Twittervane (e.g. 20% to 30%) are relevant and can be accepted as valid selections. Many URLs lead to spam sites.

Issues & lessons learned

One curator pointed out that the choice of search terms has a direct impact on the quality of the results produced by Twittervane. Unfortunately, the project team was not much more experienced than the curators and so could not offer more useful hints. Basic training, including best practice on choosing search terms to obtain the most relevant tweets, seems a helpful area of future work.

The relevance and quality of the URLs expanded by Twittervane raise the question of whether they justify the amount of processing required to produce them. This may be related not only to the search terms used, but also to the nature of social networks like Twitter, which may mean that this approach is only useful for very specific collections.

Conclusions & recommendations

Most curators who took part in the evaluation were positive about the Twittervane approach and saw it as a complementary selection tool, especially for events-based collections. However, Twittervane also points to a large number of URLs that are not relevant to the collections and cannot be used as valid selections (e.g. spam sites and duplicates). This may improve as curators become more skilled and establish best practice in choosing the most appropriate search terms for a collection. More testing over a longer period of time would be needed to determine this. The data quality issues may also be addressed technically, for example by removing duplicates and detecting spam sites, but further investigation is required to achieve this.
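To make the technical suggestion above concrete, here is a minimal Python sketch of duplicate removal and spam filtering. It is not part of Twittervane: the names `normalise`, `clean_urls` and `SPAM_HOSTS` are hypothetical, the blocklist is a stand-in for whatever spam detection further investigation might produce, and the normalisation rules (lower-casing the host, dropping fragments and trailing slashes) are one assumed definition of "duplicate".

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical blocklist; real spam detection would need a better source.
SPAM_HOSTS = {"spam.example"}

def normalise(url):
    """Reduce a URL to a canonical form so trivial variants compare equal."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    # Lower-case scheme and host, keep the query, drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))

def clean_urls(urls):
    """Return normalised URLs in first-seen order, without duplicates or blocklisted hosts."""
    seen = set()
    kept = []
    for url in urls:
        key = normalise(url)
        host = urlsplit(key).hostname or ""
        if host in SPAM_HOSTS or key in seen:
            continue
        seen.add(key)
        kept.append(key)
    return kept

# Hypothetical input: two variants of one page, one spam link, one https variant.
candidates = [
    "http://Example.org/news/",
    "http://example.org/news#top",
    "http://spam.example/win",
    "https://example.org/news",
]
cleaned = clean_urls(candidates)
```

In this sketch the `http` and `https` variants are deliberately kept as distinct URLs; whether they should be merged is a curatorial decision rather than a technical one.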

Twittervane is not a replacement for the curatorial process but has the potential to be a complementary tool, though it may only be useful for events-based collections.

Further work needs to take place to productionise Twittervane. However, the question that needs to be answered first is whether the amount of processing required to produce the small number of relevant URLs can be justified.

Note also that there are third-party services like Topsy that fulfil a very similar role.

Other tools

twittervane's People

Contributors

anjackson
