Mojomail – Using Email to Integrate Crowdsourced Information Platforms

What’s a Crowdsourced Information Platform?

“Crowdsourced Information Platform” is my private shorthand for the family of services that have recently started collecting news reports, incident reports, personal stories and the like, whether as a web post (as on social media), an audio report recorded via an IVR system, a video uploaded from a smartphone or even a simple SMS…and all the others that are sure to follow.

Essentially, it is any form of two-way communication with a large user base. Note that one-way information dissemination systems (such as TV, print media and IVR systems that don’t let you record) don’t fall under my definition of a Crowdsourced Information Platform.

Some examples of Crowdsourced Information Platforms:

  1. CGNet Swara

  2. Video Volunteers

  3. RuaiSMS (and other sites based on FrontlineSMS)

  4. All kinds of Ushahidi websites

  5. Facebook, Twitter et al

  6. Google…all of it

  7. Email  (yep, and I’ll explain…in painful detail)

Why we need integration between platforms

We all know that most of the Web is user generated…simply because everyone on the Web IS a user. However, as we open more channels of content, the definition of “user” must evolve. Different users will have different access mechanisms and therefore potentially radically different views of the same information. To bring some order to the chaos, mechanisms will be required to group and link content.

Attempts are already underway to set up repositories of crowdsourced data. However, the variance in media being the way it is, it is unlikely that any single centralized system will be able to cover everything and everyone.

A distributed system is needed where each node is able to communicate using a minimum acceptable standard. Where a medium cannot meet the peer communication standard, a translator can be introduced as an intervention.

By keeping the peer standard constant, the only requirement to add a new medium to the network remains to write a translator from that medium’s existing communication mode to the network standard.

Our (Mojolab’s) recommendation for that standard is EMail.
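
To make the translator idea a little more concrete, here is a minimal Python sketch of one, assuming email is the network standard. The `SmsReport` type, the `sms_to_email` function, the `X-Source-Medium` header and all the addresses are hypothetical names made up for illustration; they do not come from any existing platform.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class SmsReport:
    """A report as it might arrive on a hypothetical SMS medium."""
    sender_number: str
    text: str


def sms_to_email(report: SmsReport, from_addr: str, list_addr: str) -> EmailMessage:
    """Translate one SMS report into the network standard (an email)."""
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = list_addr
    msg["Subject"] = f"SMS report from {report.sender_number}"
    # Hypothetical custom header recording which medium the content came from.
    msg["X-Source-Medium"] = "sms"
    msg.set_content(report.text)
    return msg


example = sms_to_email(
    SmsReport("+910000000000", "Road to the village is flooded."),
    from_addr="sms-gateway@example.org",
    list_addr="moderators@example.org",
)
```

Adding another medium would then mean writing one more function of this shape, while everything downstream keeps handling plain emails.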

Why EMail

  • Email has been around for a looong time. It is one of the earliest applications of networking between computers and has been hammered into shape by, in IT timescales, the weight of millennia.

  • It’s been used effectively for

    • Unicasting – like when you send your wife a one liner from the convention saying “Wish you were here”

    • Multicasting – like when you erroneously copy in your girlfriend

    • Broadcasting – like when Facebook tells all your friends by email that you changed your marital status to “Divorced” and your Relationship Status to “Single” on the same day.

  • It can be used to send a payload plus metadata

You can send text and you can send binary data too. This makes email particularly attractive to multiplex different kinds of data, such as audio, video, text and pictures.

  • Mail clients are cheap to write. Python has a nice IMAP library that will do most of your work for you. So do most of the other languages. It’s really easy to script email!

  • Authentication and security are outsourced

With anything involving mass usership, authentication and identity management is a nightmare all by itself. With email as the peer communication standard, that nightmare can be left to whoever is running the mail server, which is usually someone competent to handle it (or so we hope).

  • Works asynchronously

Users who live in bandwidth-abundant places like Sweden, Palo Alto and Bangalore (among others) will have trouble understanding how important asynchronicity is when you have a connection that’s bursty based on everything from weather conditions to political scams.

However, when it comes to bad-bandwidth areas, it’s great when your mail can queue up without holding up your interface and go out all at once when the connection gets better.
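
The queue-and-flush behaviour described above is easy to approximate in a script. The sketch below is a rough illustration and not how any particular mail client does it: the `queue_message` and `flush_outbox` names, the `.eml` spool directory and the `offline` stand-in sender are all assumptions made for this example. In real use the `send` argument might wrap `smtplib.SMTP.send_message`.

```python
import tempfile
import time
import uuid
from email import message_from_bytes, policy
from email.message import EmailMessage
from pathlib import Path
from typing import Callable


def queue_message(outbox: Path, msg: EmailMessage) -> Path:
    """Write a message to the outbox directory without touching the network."""
    outbox.mkdir(parents=True, exist_ok=True)
    # Timestamp prefix keeps files roughly in arrival order; uuid avoids clashes.
    path = outbox / f"{time.time_ns():020d}-{uuid.uuid4().hex}.eml"
    path.write_bytes(msg.as_bytes())
    return path


def flush_outbox(outbox: Path, send: Callable[[EmailMessage], None]) -> int:
    """Try to send every queued message; keep whatever fails for the next attempt."""
    sent = 0
    for path in sorted(outbox.glob("*.eml")):
        msg = message_from_bytes(path.read_bytes(), policy=policy.default)
        try:
            send(msg)
        except OSError:
            # The connection dropped again: stop here and retry on the next flush.
            break
        path.unlink()
        sent += 1
    return sent


def offline(msg: EmailMessage) -> None:
    """Stand-in sender that behaves as if the link is down."""
    raise ConnectionError("no link")


delivered = []
with tempfile.TemporaryDirectory() as tmp:
    outbox = Path(tmp) / "outbox"
    note = EmailMessage()
    note["Subject"] = "report 1"
    note.set_content("Recorded while offline.")
    queue_message(outbox, note)
    sent_offline = flush_outbox(outbox, offline)          # nothing leaves the queue
    sent_online = flush_outbox(outbox, delivered.append)  # the queue drains
```

The interface never waits on the network: writing to the outbox always succeeds, and the flush can run whenever the connection happens to be up.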

 

Case Study: Mojomail – Using Mailman and GMail together to create a Distributed Content Management System

Scenario

CGNet Swara is a crowdsourced news portal for Central India focussing on Adivasi (indigenous) and other marginalized communities. It started off as a pilot in 2010 with a single IVR number linked to a blog, moderated by Shubhranshu Choudhary (Shu). Over time, Shu and his team have trained people in the field to use the platform and also to train others to use it to share relevant stories from the grassroots in media dark regions of Central India.

Users call the number and are presented with the option of recording fresh content, or listening to voice posts left by other users.

Each recording needs to be listened to, verified, quality edited (amplified, cleaned up) and summarized in text by a human moderator. Then it is published to the web interface (the blog) as well as the IVR interface.

The verification process could involve calling back the user and confirming things, or checking with other sources in the vicinity.

Initially, when the platform was new, the volume of incoming calls, and consequently the number of audio recordings coming in, was low: fewer than 10 a day. Today the platform receives 300-400 calls a day and over 60 recordings, each about 2-3 minutes in length.

The moderation effort has therefore increased sixfold.

Moreover, as the community gets more and more comfortable using the platform, they are more and more eager to get involved in the content management process as well as to own, replicate and customize the platform.

In 2012, we deployed 3 additional IVR servers to complement the existing one in Bangalore.

These correspond to the MP/CG, BR and AP telecom circles respectively.

At the same time, we added further lines to the existing server in Bangalore.

In 2013, we added on a further channel to the Bangalore server, called Adivasi Swara, which is in the Gondi language.

To move towards community moderation across all these channels, we needed a system that:

  • is accessible by many users – Loudblog, the existing interface on the IVR servers, is multiuser, but not really (as is the case with most content management systems)

  • provides some form of centralized access control, so that the community can choose who to share content with

  • should work on all kinds of bandwidth, from 2G (about as bad as a 56k modem) to leased lines (4 Mbps stuff that we dream about)

  • should allow people to contribute in multiple languages and also send back binary information like photos or audio responses to attach to the incoming content

Design strategy

To design and implement a new system of this sort would be a fairly expensive exercise.

So we took some shortcuts:

  • We used what people already know

GMail

GMail is the interface part of our DMUCMS (Distributed Multi User Content Management System). Everyone who wants to moderate just gives us their GMail address, or creates a new one. That’s all it takes! No training manual, no learning curve while people figure out the form interface…this is the part where the node brings the communication know-how. Of course, this raises the bar on being a moderator…you have to know email.

Why we like GMail –

    • Conversation view
    • Drive support
    • Hangouts and chats baked in
    • Multimedia attachments
    • Multilingual
    • Presents an individual view of shared data (mails on a mailing list)

 

  • We used what has worked through generations of other hardware and software

Mailman

We make a mailing list, and put those addresses that people gave us on it. That’s it, nothing more needed. That’s our content management system database set up.

Why we like Mailman –

    • Minimal
    • Efficient
    • Fully functional
    • Keeps inboxes in sync
  • What we absolutely needed that we couldn’t find…we wrote

Mojomail

Mojomail is essentially ABBOTS – A Big Bunch Of Tiny Scripts. It automates the formatting and sending of emails to the list we made in Mailman.

Each of our IVR servers gets two email addresses, one that it sends out on (like a public key) and one that it receives on (like a private key, known only to the server admin).

Every time a new piece of content, i.e. a recording, comes in, the server simply creates an email from a template, attaches the MP3 of the recording to it and sends it out to the mailing list so that all the moderators get a copy.

Sort of like those very proper people who always write in the same format and include helpful tagging information so that you can search it better.
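
For a feel of what “creates an email from a template and attaches the MP3” could look like, here is a minimal sketch using Python’s standard `email` package. It is not the actual Mojomail code: the subject and body templates, the `build_recording_mail` function, the field names and the addresses are all assumptions made up for this example.

```python
from email.message import EmailMessage

# Hypothetical templates; the real Mojomail formats are not shown in this post.
SUBJECT_TEMPLATE = "[{channel}] Recording {recording_id}"
BODY_TEMPLATE = (
    "Channel: {channel}\n"
    "Recording ID: {recording_id}\n"
    "Received: {received}\n"
)


def build_recording_mail(channel, recording_id, received, mp3_bytes,
                         from_addr, list_addr):
    """Fill the templates and attach the recording as an audio/mpeg part."""
    fields = {"channel": channel, "recording_id": recording_id, "received": received}
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = list_addr
    msg["Subject"] = SUBJECT_TEMPLATE.format(**fields)
    msg.set_content(BODY_TEMPLATE.format(**fields))
    msg.add_attachment(
        mp3_bytes,
        maintype="audio",
        subtype="mpeg",
        filename=f"{recording_id}.mp3",
    )
    return msg


mail = build_recording_mail(
    channel="swara",
    recording_id="000123",
    received="2013-06-01 10:15",
    mp3_bytes=b"placeholder bytes, not real audio",
    from_addr="ivr-out@example.org",
    list_addr="moderators@example.org",
)
```

Sending the result is then a matter of handing `mail` to something like `smtplib.SMTP(...).send_message(mail)`, with the server details depending on your own setup.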

How Email Moderation Works

That’s it. That’s our DMUCMS. The subject lines being preformatted means that GMail nicely organizes recordings into conversations, and each moderator can contribute to the processing of each message and update the conversation. They can also attach edited versions of the recording, other media…pretty much whatever is needed.

Finally, when the conversation reaches completion and the message is ready to publish, the owner of each interface, who is also a member of the list, takes the final version and releases it onto their respective interface, such as the IVR, a blog or social media.
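
If you ever need the same grouping in a script, the idea can be roughly approximated by normalizing subject lines. To be clear, this is not GMail’s actual threading logic, which we have no visibility into; the `thread_key` and `group_by_thread` helpers and the `[swara] Recording …` subject format are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Matches one or more leading reply/forward prefixes such as "Re:" or "Fwd:".
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def thread_key(subject: str) -> str:
    """Strip reply prefixes and collapse whitespace so replies share one key."""
    return " ".join(_PREFIX.sub("", subject).split())


def group_by_thread(subjects):
    """Group raw subject lines by their normalized thread key."""
    threads = defaultdict(list)
    for subject in subjects:
        threads[thread_key(subject)].append(subject)
    return dict(threads)


inbox = [
    "[swara] Recording 000123",
    "Re: [swara] Recording 000123",
    "[swara] Recording 000124",
    "RE: Re: [swara] Recording 000123",
]
threads = group_by_thread(inbox)
```

Because every recording goes out with a predictable subject, all the replies about recording 000123 land under one key and recording 000124 stays separate.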

 

 

We are now in the process of automating publishing to the various interfaces via email, which should be relatively easy since almost everything supports post by email.
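
One possible shape for the reading side of that automation is sketched below with Python’s `imaplib`. This is a guess at an approach, not our finished implementation: `fetch_unseen`, `final_attachments`, the subject line and the file names are hypothetical, and `OfflineConn` is only a stand-in so the example runs without a mail server. A real run would pass a logged-in `imaplib.IMAP4_SSL` connection instead.

```python
import imaplib
from email import message_from_bytes, policy
from email.message import EmailMessage


def fetch_unseen(conn: imaplib.IMAP4) -> list:
    """Return unread INBOX messages as EmailMessage objects."""
    conn.select("INBOX")
    status, data = conn.search(None, "UNSEEN")
    if status != "OK" or not data or not data[0]:
        return []
    messages = []
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        # A successful fetch returns a (header, raw_bytes) tuple as the first item.
        if status == "OK" and parts and isinstance(parts[0], tuple):
            messages.append(message_from_bytes(parts[0][1], policy=policy.default))
    return messages


def final_attachments(msg: EmailMessage) -> list:
    """Collect (filename, bytes) for every audio attachment in a message."""
    return [
        (part.get_filename(), part.get_content())
        for part in msg.iter_attachments()
        if part.get_content_maintype() == "audio"
    ]


class OfflineConn:
    """Stand-in for a logged-in IMAP connection (assumption, for the demo only)."""

    def __init__(self, raw_message: bytes):
        self.raw_message = raw_message

    def select(self, mailbox):
        return "OK", [b"1"]

    def search(self, charset, *criteria):
        return "OK", [b"1"]

    def fetch(self, message_set, message_parts):
        return "OK", [(b"1 (RFC822)", self.raw_message), b")"]


final = EmailMessage()
final["Subject"] = "Re: [swara] Recording 000123"
final.set_content("Final version attached.")
final.add_attachment(
    b"edited audio bytes",
    maintype="audio",
    subtype="mpeg",
    filename="000123-final.mp3",
)

ready = [final_attachments(m) for m in fetch_unseen(OfflineConn(final.as_bytes()))]
```

Each interface owner’s script could then push the collected files to its own channel, whether that is the IVR, a blog or social media.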

Swara MojoMail and Living Data - How it all fits

What’s next?

We are also using email in similar ways in projects in other parts of the developing world.

In summary, email-based integration strategies allow you to:

  • Share what you want

  • With who you want

  • When you want

  • Irrespective of simultaneous bandwidth availability

  • Without having to write too much code

  • Or spending too much of your capacity-building budget

We welcome all support in developing a content sharing standard around email as well as development support for Mojomail as an open source project at http://bitbucket.org/mojolab/mojomail
