Video and updates from ONA15 session: Whose Idea of the Future Is This?

Inspired by Afrofuturism and other underrepresented perspectives, I organized a session at this year’s Online News Association 2015 conference in Los Angeles with an awesome group of speakers:

We’ve assembled a group of experts on futurism to look at predictions and possibilities for how our society is changing, and help rethink our approach to media, technology and our communities.

Here’s the session page. Here’s the Storify:


Job news: I’m joining McClatchy DC as a data developer

After nearly five years at The Washington Post, I’m thrilled to start a new job in late September focused on data stories, projects and tools at McClatchy’s DC bureau.

I’ve loved working in the Post newsroom with such fantastic, inspiring coworkers. From starting on health/science and then world/national security production to building news apps/tools and managing local data projects to now producing on Team Rainbow, it’s been an invaluable and rewarding experience.

At McClatchy, this new role offers a unique opportunity to collaborate with their formidable DC bureau and across their 29 news organizations, plus sit near former Post colleagues who are part of an impressive video team. It’s also a chance to work again with my hometown paper The Miami Herald, where I freelanced and interned during college. I can’t wait to join all the talented journalists at McClatchy.

Here’s the very kind announcement from my new boss, Julie Moos:

All,

I’m excited to announce that Greg Linch will be joining us late this month to help plan, produce and launch data-driven projects coming out of our local newsrooms and out of DC.

Greg’s arrival enables us to broaden and deepen our data efforts, which you’ll be hearing more about in coming weeks. To start, we plan to provide do-it-yourself tools and a range of support for the data storytelling that’s becoming so essential to readers everywhere across a range of subjects.

Greg joins us from The Washington Post, where he currently works on Project Rainbow (the tablet team); his previous roles there include local data editor, news apps producer and national security producer. He has FOIAed and negotiated with local agencies to publish their daily crime data or weekly crime reports; led work on voter’s guides and results pages for primary and general elections; developed systems for handling documents (like the Clinton emails) and email newsletters; and worked on many projects that required reporting skills as strong as his coding skills.

Greg is a member of the board of directors of the Online News Association, co-organizer of the DC Hacks/Hackers chapter and an all-around great journalist capable of elevating our work in interesting ways. Here’s his resume.

Greg will be based at Tish’s old desk, as he fills the position opened by her departure. His professional career started at The Miami Herald and we are happy to lure him back to McClatchy, starting Sept. 29.

Thanks for joining me in making him feel welcome.

Julie

Johanna Drucker on data vs. capta

Johanna Drucker in Humanities Approaches to Graphical Display:

Capta is “taken” actively while data is assumed to be a “given” able to be recorded and observed. From this distinction, a world of differences arises. Humanistic inquiry acknowledges the situated, partial, and constitutive character of knowledge production, the recognition that knowledge is constructed, taken, not simply given as a natural representation of pre-existing fact.

Also, in her paper on Graphesis: Visual knowledge production and representation:

Data are considered objective “information” while capta is information that is captured because it conforms to the rules and hypothesis set for the experiment.

Hat tip to Mark Hansen, who mentioned the former at #NICAR14. And hat tip to Tim Carmody for first introducing me to Drucker when he recommended The Visible Word.

Highlights from #cj2014 opening keynote: Jon Kleinberg

I’m following the Computation + Journalism 2014 symposium via the hashtag and livestream. Below are some highlights I collected from the opening keynote.

#cj2014: Tracing the Flow of On-Line Information through Networks and Text

Keynote by Jon Kleinberg at 2014 Computation + Journalism symposium at Columbia University

  1. Event page:
  2. Highlights from the keynote (in chronological order):
  3. Keynote by Jon Kleinberg of Cornell: metaphors of information travelling online include the library and the crowd #cj2014
  4. #Information travels on-line via #library (pages, links, association) & crowd (memes, contagion) | #data #CJ2014
  5. Jon Kleinberg opens #CJ2014 with a ref to the classic essay As We May Think.  http://j.mp/ZPWaO1 
  6. Jon Kleinberg, speaking right now at #cj2014, did some really cool work tracking chain letters online in 2008  http://www.pnas.org/content/105/12/4633.full 
  7. We can track the flow of information temporally, structurally, and in terms of content, says Jon Kleinberg #cj2014
  8. But are crowd & library metaphors dual: people trailblazing through documents or documents transmitted through networks of people? #cj2014
  9. It’s easier for algorithms to track items (quotes, photos, phrases) than stories. Q: Does that encourage pack journalism? #CJ2014
  10. Tracking stories through networks reveals difficulties eg., natural language. But can track quotes to show news cycles #CJ2014
  11. Kleinberg explains tracking essential elements of a story (like phrases) as they move through networks. #cj2014 http://t.co/V1fiFZWUBS

  12. Half of all reshares on FB happen in large cascades (>500) | #paradox #viral #CJ2014
  13. Basic question: how to predict what content will be shared widely? Or, are cascades unpredictable? #cj2014 http://t.co/Q7dCleEkXH

  14. #cj2014 Is virality predictable? You as poster rarely experience it w your content, but you as consumer see it often http://t.co/IEgOmZtWIv

  15. One solution: reframe question as tracking rather than snapshot instant: what are the chances of this being shared further? #cj2014
  16. On whether something “goes viral”: “An important moment in a cascade is the moment it escapes the neighborhood of the root.” #cj2014
  17. Temporal features most powerful in predicting resharing of photo memes #CJ2014 http://t.co/3ZKFHIzO7Y

  18. My thoughts are on how narratives or stories in news, eg images of ‘typical’ migrants, circulate and are widely diffused #cj2014
  19. Troubling finding here seems to be that actual content has less impact on how likely something is to go viral #cj2014 http://t.co/lver1zx14e

  20. Research to understand discussion and comment threads - #cj2014 keynote by Jon Kleinberg http://t.co/3HUQi1uZj1

  21. Kleinberg now moving from global discussion to local conversations via threads or friends. What makes them engaging, long, short? #cj2014
  22. Tracking the virality of memes: Speed is important. Pics that get the first 1k of shares fast are more likely to go viral after. #cj2014
  23. Content more likely to spread if strangers share it = good reason for journalists to make sure their networks are diverse #CJ2014
  24. #visualization shows 2 kinds of threads: long due to many contributors posting once or convo among few ppl #cj2014 http://t.co/Js2wFv0lyy

  25. Super interesting question!: why do certain quotes/content stand out? Linguistic markers? #visualization #cj2014 http://t.co/1muOY6tZxI

  26. For a week in September 2008, Obama commandeered the news media with the line “lipstick on a pig,” says Jon Kleinberg #cj2014
  27. That would be a nice job description for a business card: Meme tracker. #cj2014
  28. Kleinberg compares memorable & unmemorable movie lines as lab setting to see what features contribute to memorable or viral text #CJ2014
  29. How to track virality of content - use movie quotes: "These aren't the droids you're looking for." #cj2014 http://t.co/Z1YqXGlsgM

  30. Why do we like “these aren’t the droids you’re looking for” but not “you don’t need to see his identification” #CJ2014
  31. Memorable quotes are sequences of unusual words with common part of speech patterns #cj2014 – application to headline writing?
  32. Memorable quotes are less probable in their word choices but more probably in their sentence (part-of-speech) structure – Kleinberg. #cj2014
  33. Jon Kleinberg: Socially shared information - how to predict success stories? Try a sequence of unusual words.#cj2014 http://t.co/AVzW3vImS6

  34. Is there an algorithmic pattern to why a movie quote is memorable? Take “you had me at hello.” What’s so special about it? #cj2014
  35. “Memorable quotes need to have a certain portability” _Jon Kleinberg #cj2014
  36. Memorable quotes tend to be more ‘general’: more present tense, indefinite articles, fewer third-person pronouns >> ‘portability’ #cj2014
  37. #CJ2014 The ‘You had me at hello’ paper reference by Jon Kleinberg (including movie quotes memorability test):  http://www.mpi-sws.org/~cristian/memorability.html 
  38. Slogans in #advertising are like memorable quotes. “It just keeps going & going & going.” | #marketing #NLP #CJ2014
  39. Is there an analogy of genetics for text: ‘fitness’ of text for sharing, mutation of ‘junk’ parts of quotes while core parts remain #cj2014
  40. #cj2014 Just as genes have functional parts and junk parts, so does text - Beautiful analysis of content prolongation http://t.co/oFsLnMmrN7

  41. “Genetic analogies for memes are becoming increasingly rich” -Jon Kleinberg #cj2014
  42. Sharing on social networks: “Can cascades be predicted?” — paper by Jon Kleinberg et al  http://bit.ly/1nCkspI  #cj2014
  43. Kleinberg wraps up his fascinating talk with new avenues for computational insight into info flows #CJ2014 http://t.co/vTloP7pllJ

  44. Great question: What are the features of content that make people STOP watching/reading/commenting? #CJ2014
  45. Another great question: Are there computational ways to evaluate WHO gets to be quoted in the first place? #CJ2014
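One of the findings highlighted above is that memorable quotes use less probable words. Here is a rough sketch of how that half of the claim could be measured: score a quote by the average surprisal of its words under a simple word-frequency model. Everything here is an assumption for illustration, not Kleinberg's method: the function names `tokenize` and `avg_surprisal` are made up, the background text is a toy, and the part-of-speech half of the finding is not covered.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    strip_chars = ".,!?\"'“”‘’"
    words = (w.strip(strip_chars).lower() for w in text.split())
    return [w for w in words if w]


def avg_surprisal(quote, background_counts):
    """Average -log2 probability per word, with add-one smoothing.

    Higher values mean the quote's word choices are less expected
    given the background word counts.
    """
    words = tokenize(quote)
    if not words:
        raise ValueError("quote contains no words")
    total = sum(background_counts.values())
    vocab = len(background_counts) + 1  # one extra slot for unseen words
    bits = [
        -math.log2((background_counts[w] + 1) / (total + vocab))
        for w in words
    ]
    return sum(bits) / len(words)


# Toy background text (hypothetical); a real test would need a large corpus.
background = Counter(tokenize("the cat sat on the mat and the dog sat on the cat"))
common = avg_surprisal("the cat sat", background)
unusual = avg_surprisal("purple zebra dances", background)
print(common < unusual)
```

A unigram count is about the simplest language model there is, so treat the score as a starting point for experiments with headlines or quotes rather than a measure of memorability.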

Blockchains for News

Anil Dash’s piece on applying an underlying concept of Bitcoin to track digital art has me thinking about the potential applications of blockchains for news. As he writes:

What the technology behind Bitcoin enables, in short, is the ability to track online trading of a digital object, without relying on any one central authority, by using the block chain as the ledger of transactions.

What if we built a blockchain system for news? It could record and verify facts, data, updates, quotes, people, etc., much like the Bitcoin protocol tracks transactions in a database that no one owns, but of which everyone always has the same copy. (Update: This is meant more as “inspired by blockchains,” but it would be a different kind of system because we’re not dealing with transferring or owning the units.)
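To make that a bit more concrete, here is a minimal sketch of the core data structure: an append-only list of entries where each one carries a hash of the entry before it, so any later edit to an earlier entry is detectable. It is only an illustration under stated assumptions. The function names (`block_hash`, `append_entry`, `is_valid`) and the payload fields are hypothetical, and there is no proof of work, no network and no shared consensus, which are the parts that make Bitcoin's ledger one that "no one owns."

```python
import hashlib
import json


def block_hash(index, prev_hash, payload):
    """Hash an entry's contents together with the previous entry's hash."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Return a new chain with one more entry linked to the last one."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    return chain + [block]


def is_valid(chain):
    """Check every entry's hash against its contents and its predecessor."""
    prev_hash = "0" * 64
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical entries: a quote, then an update that refers back to it.
ledger = []
ledger = append_entry(ledger, {"type": "quote", "source": "Example Official", "text": "Example statement."})
ledger = append_entry(ledger, {"type": "update", "text": "Example correction to the earlier report."})
print(is_valid(ledger))
```

Changing the text of the first entry after the fact makes `is_valid` fail, because the stored hash no longer matches the contents. That gives tamper evidence, but it says nothing about whether the quote was accurate in the first place.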

How useful would that be in the reporting and dissemination of information? With all the noise introduced during breaking news and even long, complex story arcs, it seems like there’s a lot of potential here.

The nature and task of art is different from news, but there’s much we can learn (stay tuned for more posts on that topic). Consider this from Anil’s piece:

Reblogging is essential to getting the word out for many digital artists, but potentially devastating to the value of the very work it is promoting. What’s been missing, then, are the instruments that physical artists have used to invent value around their work for centuries — provenance and verification.

Think of these two key terms he uses.

Provenance. 

Verification.

In the context of news, provenance could be the source of information — or it could be who first reported something. Verification, of course, is already a common term.

The next question then is: What instruments do we have to give our work value?

Not methods. Instruments.

All this — you guessed it — also makes me think of GitHub for News (more here). That idea would make tracking updates, contributions, feedback and even facts more structured by incorporating them in a versioning system like git.

Neither GitHub for News nor Blockchains for News would solve all the problems they aim to tackle. Anil’s piece smartly notes in the art realm:

as with any new idea, it can be difficult to reckon with the implications. Steven Melendez asserted that monegraph could “eradicate fake digital art”, when this is exactly backwards. In fact monegraph makes it possible to have “fake digital art”, because prior to this we had no consistent way of defining an “original”.

So, where should we start?

UPDATE: More discussion and explanation…

PoW = proof of work

Also, just for fun and more Bitcoin background: By reading this article, you’re mining bitcoins