May 15, 2013

Visualizing World Languages in Wikipedia

After seeing in the infographic created by Funders and Founders (http://notes.fundersandfounders.com/post/50347902559/worlds-top-languages-on-the-internet) how the number of speakers of a language is not correlated with the volume of internet content in that language (obvious as that may be), I decided to collect the number of Wikipedia articles in those languages. So I collected the data from the Wikipedias written in the top 10 languages in the world and made these charts using Excel.

Sorted by Wikipedia articles
Sorted by population
After sorting by the number of Wikipedia articles, I see that although Mandarin Chinese is spoken by the most people in the world, Chinese drops to 6th position when ranked by number of Wikipedia articles. The bad thing about Excel is that after I sort the columns, it does not retain the same color for the same language :(
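Outside Excel, one way around the color problem is to attach the color to the language name rather than to the bar position. Here is a minimal Python sketch of that idea. The language names, numbers, and color names are placeholders I made up for illustration; they are not the data behind the charts above.

```python
# Placeholder rows: (language, speakers, articles). The names and numbers
# are invented for illustration; they are NOT the data used in the charts.
rows = [
    ("Language A", 900, 50),
    ("Language B", 400, 300),
    ("Language C", 250, 120),
]

# Fix one color per language up front, keyed by name, so the color
# follows the language no matter how the rows are sorted later.
palette = ["blue", "orange", "green"]
color_of = {lang: palette[i % len(palette)] for i, (lang, _, _) in enumerate(rows)}


def ranked(rows, key_index):
    """Return (language, value, color) tuples sorted by one column, largest first."""
    ordered = sorted(rows, key=lambda row: row[key_index], reverse=True)
    return [(row[0], row[key_index], color_of[row[0]]) for row in ordered]


by_population = ranked(rows, 1)
by_articles = ranked(rows, 2)

for lang, value, color in by_articles:
    print(f"{lang}: {value} articles, drawn in {color}")
```

Because the color is looked up by name, the same language gets the same color in both rankings.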

Then I decided to look at the percentage of articles in those languages: not the percentage of all Wikipedia articles, but the proportion among the articles in these ten languages. So I drew this pie chart:
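The proportion behind such a pie chart is simply each language's count divided by the total over the selected languages only. A small Python sketch, again with made-up placeholder counts rather than the real Wikipedia numbers:

```python
def shares(counts):
    """Map each language to its percentage of the total across the given languages only."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no articles to compare")
    return {lang: 100.0 * n / total for lang, n in counts.items()}


# Placeholder counts, invented for illustration only.
article_counts = {"Language A": 50, "Language B": 300, "Language C": 150}

for lang, pct in shares(article_counts).items():
    print(f"{lang}: {pct:.1f}%")
```

The percentages always add up to 100 within the chosen set, which is why they say nothing about the share of all Wikipedia articles.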


Visualizing World Languages and Bangla Content in Internet

I found this visualization in my Facebook feed, shared by some of my friends, and also on my lab's mailing list:


Funders and Founders made this visualization, which compares the world's languages by number of speakers and by internet content. I am attaching the image from their website:

I would like to know more about the data and what they mean by language used on the internet: language in text? Language in video or music? Everything? For the internet, another interesting observation would be which languages are generating more content now than before, because when the internet started, everything was definitely in English.

I see Bengali is in 7th position, way to go population growth! At least we are among the top ten languages by number of speakers. But then I look at the bar graph at the bottom and can see the discrepancy. I am glad that for Chinese and Spanish the number of speakers and the volume of internet content show some correlation. The problem with the visualization is that the circles show the number, not the percentage, while the bars show the percentage, not the number. In both cases I would like to see both percentage and number. I would also like to see both sets of data in a similar visualization; I do not see why one is a circle and the other a bar. It is difficult to compare the % of world population with the % of internet content. But I appreciate the effort to make a point with this visualization. I am confused, though, about the thought bubble with one person in a turban, another in a farmer's hat, and another in a baseball cap: is the content creator thinking about all of them? So s/he is trying to include internationalization features [for Chinese/Sikh/American people]?

Makes me sort of sad though. I don't know about other languages, but in the case of Bengali, the Wikipedia movement is trying to create more content in Bengali. The thing is, people who can afford a computer and internet already know workable English (because they are privileged enough to go to school and have electricity), so they don't feel the need for Bengali content. And the people who actually need it in Bengali cannot afford computers. When our higher-level textbooks and exams are also in English, it makes more sense to search for academic documents in English. We cannot deny that the lingua franca of modern academic publication and research is English.

I wonder, though, what the chart would look like if they considered only social/creative content (like blog posts and Facebook posts). Standard Unicode support for Bengali fonts started around 2001; before that, people had to write Bengali using Roman script. Phonetic keyboards made it a lot easier, so we can type in Bengali without using a Bengali keyboard layout. It is easier than before, but still not as easy as typing in English.


Bengali-speaking area.
Source: Wikipedia, http://en.wikipedia.org/wiki/Bengali_language
Most Bengali content I have come across on the internet is:
       Text: newspaper articles, blogs, social media status updates, textbooks.
       Video: Bengali songs, movies, or TV dramas (YouTube), video lectures (Shikkhok.com).
       Audio: music sites sharing mp3s.


The blog posts and newspaper articles are written in Bengali script, but Facebook posts and tweets are not always: most people transliterate Bengali into the Roman alphabet, probably because of the Roman keyboard and because Bangla writing software is not easy to use. First of all, we don't need to install any software to use the Roman alphabet, and people who seldom write in Bangla might not be interested in using such software.

Another reason is familiarity: we do not have Bangla writing options on cell phones (at least in most cases), so people are already used to texting in Bangla using the Roman alphabet, and they do the same when they update their status on Facebook.

Moreover, there is no plugin that lets me write Bangla in-place when I update my status in Facebook/ Twitter. I have to type it somewhere else and then copy-paste the text in the status update text-area. 

Reaching the appropriate audience is also a big issue. Almost everyone understands English ... OK, almost everyone using the internet understands English. So am I losing some audience when I write in my native language? Or maybe I can gain more Bengali audience by writing in Bangla.

Now I come back to the original point: the privileged people may not care about more Bangla content, and the people who need Bangla content the most are living on the edge, so we cannot build a business model on that population. Then who are the consumers? If there is an e-commerce website in Bangla, how will it compete against the English ones? Or do we even need such websites in Bangla? I am not aware of the current situation of mobile apps in Bangladesh, so I cannot tell whether fishermen or farmers are actually using mobile internet to sell their products. But if that turns out to be cost-effective for them, then that is a huge area for developing Bangla mobile apps.

Still, I am optimistic about the use of Bangla on the internet. Now we have a full-fledged search engine, Pipilika (the ant): http://pipilika.com/ , where we can search for Bangla content. We learn best when we learn in our mother tongue, so there is definitely a need for more Bangla content that will help us learn. And no matter what, we think in our own language, so when it comes to sharing our creativity, we prefer our own language. When I see how the Bangla blogosphere is emerging (to the point where the government is considering censoring it), I feel good about the future of creative content in Bangla on the internet. I mean, aren't we the people who love to write poems and fiction!


Feb 9, 2013

Activity on #Shahbag in Twitter: the people, what they are talking about, what they are sharing. Part 1

As the movement of Bangladeshis gains momentum, more and more Bangladeshis are using social network services like Twitter and Facebook to discuss the movement, share videos, and organize events around it.

For the last three days I have been collecting the tweets and the Twitter networks of the people who are tweeting with the hashtag #Shahbag. Many of the tweets are from very new users: people who joined Twitter in order to spread the word, to tell the world how they feel about this movement. I could see that they still do not have profile pictures, so Twitter shows the default 'egg' icon for them. Twitter might notice how an event like this can increase its user base.

At the end of the day on Feb 7, I drew this Twitter network using the social network analysis tool NodeXL:
Not that big: few people and not many tweets. But let's see what happened the next day:
Now we have so many people that I had to use some encoding to fit everyone in the image, so I resized their profile icons: the more they tweeted, the bigger I made their icons, and those people are also placed in the middle. The less they tweeted, the smaller their icon. You can also see the links between people: these links indicate that they retweeted, replied to, or mentioned each other. So when A retweets his friend B's tweet, an arrow is drawn from A to B.
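To make the encoding concrete, here is a rough Python sketch of how such a network could be assembled by hand. This is not how NodeXL does it internally. The tweets, the handles, the regular expression for finding @handles, and the icon sizes are all my own assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical tweet records: (author, text). The handles and texts are invented.
tweets = [
    ("alice", "RT @bob: People are gathering at #Shahbag"),
    ("bob", "People are gathering at #Shahbag"),
    ("carol", "@alice thanks for sharing #Shahbag"),
    ("alice", "Photos from today #Shahbag"),
]

# Retweets ("RT @bob: ..."), replies ("@alice ...") and mentions all contain
# an @handle, so a single pattern is enough for this sketch.
MENTION = re.compile(r"@(\w+)")


def build_network(tweets):
    """Count tweets per author and directed links author -> mentioned user."""
    tweet_counts = Counter()
    edges = Counter()
    for author, text in tweets:
        tweet_counts[author] += 1
        for target in MENTION.findall(text):
            if target != author:
                edges[(author, target)] += 1
    return tweet_counts, edges


def icon_size(count, max_count, smallest=10.0, largest=50.0):
    """Scale the icon linearly with tweet count; the busiest user gets the largest icon."""
    if max_count <= 0:
        return smallest
    return smallest + (largest - smallest) * count / max_count


tweet_counts, edges = build_network(tweets)
busiest = max(tweet_counts.values())
for user, count in tweet_counts.items():
    print(user, count, icon_size(count, busiest))
for (source, target), weight in edges.items():
    print(f"{source} -> {target} ({weight})")
```

The arrow direction follows the convention above: the person who retweets, replies, or mentions is the source, and the person referred to is the target.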

But now, as you have already guessed, there is more participation: more tweets, more people. Today when I pulled the data for all these people, I noticed that the graph looked like a hairball, so I tried to organize the people and their Twitter activity into groups. How? Some people know each other personally or follow each other on Twitter; they reply to their Twitter friends, mention them, and retweet each other's tweets, so they are connected by links. For example, you can retweet your friends' tweets or reshare the URLs they share on Twitter, and your other followers in turn reshare or retweet the things you tweet or share. So we all get connected this way, and small clusters form among all those users. This is what I got after grouping the people who are more connected to each other (Feb 9, 2013). Inside each group we can see one or two prominent people who are more active within their own group, so they are basically leading the group by creating more useful content that their friends reshare or retweet.
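The grouping idea can be sketched in a few lines of Python. Note that this is a simplified stand-in: it groups people who are linked directly or indirectly (connected components), which is cruder than the clustering a tool like NodeXL offers. The users, links, and tweet counts below are invented for illustration.

```python
from collections import defaultdict


def groups(edges):
    """Split users into groups of people linked directly or indirectly (arrow direction ignored)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    result = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Walk outwards from this user until no new people are reached.
        stack = [start]
        seen.add(start)
        members = set()
        while stack:
            user = stack.pop()
            members.add(user)
            for other in neighbours[user]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        result.append(members)
    return result


def most_active(members, tweet_counts):
    """Pick the member with the most tweets (ties broken alphabetically)."""
    return max(sorted(members), key=lambda user: tweet_counts.get(user, 0))


# Invented example: who retweeted, replied to, or mentioned whom.
edges = [("alice", "bob"), ("carol", "alice"), ("dave", "erin")]
tweet_counts = {"alice": 5, "bob": 2, "carol": 1, "dave": 1, "erin": 4}

for members in groups(edges):
    print(sorted(members), "most active:", most_active(members, tweet_counts))
```

With real data, a hairball usually ends up as one giant connected group, which is exactly why a proper clustering algorithm is needed there; the sketch only shows the idea of "groups plus a prominent person in each".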




Also, today's most shared URLs on Twitter regarding the #Shahbag movement are:
http://en.wikipedia.org/wiki/2013_Shahbag_Protest
http://www.bbc.co.uk/news/world-asia-21383632
http://bbc.in/11uKAHG
http://www.flickr.com/photos/enamulhoque1/sets/72157632724551111/
http://www.thedailystar.net/newDesign/news-details.php?nid=268446
http://shar.es/YpIdO
http://bdnews24.com/bangladesh/2013/02/07/more-and-more-protests-staged
http://youtu.be/6TCEvGlvfKQ


Jan 31, 2013

Presenting Conference Monitor at Social Informatics 2012

This was the first annual social informatics conference, and it was in DC. I was hoping that my paper would get accepted. I always hope the same for all my papers and all conferences, but this time I was very optimistic because I had worked a lot on this one. I was very happy to finally get the acceptance email and very positive reviews from the committee. My flight to Dhaka was right after the last day of the conference, so I was preparing my talk at the same time as I was preparing myself for my trip home. I am glad that my colleagues at the HCIL helped me with the practice talk. This is a culture I highly value here: getting feedback from our peers before we actually give the talk at a conference. After the practice talk, I changed my presentation a lot, and later made a demo video of the tool that I presented at the conference. Showing the demo helped a lot: it made clear what my research was and what the tool does. The conference did not have a separate demonstration session, so I showed the demo of Conference Monitor during my talk. I was overloaded with coffee; the talk was right after the final exam of the Information Visualization course, for which I was the TA. After the final exam, I collected all the exam papers, took the metro to the conference venue, and had one more hour before my talk. The talk went better than I expected: I received positive feedback on my research topic, and other researchers showed interest in the tool and wanted to use it for their own research. And, above all, no one fell asleep during my talk, what more can you expect? :) I demonstrated how Conference Monitor can be used to analyze and visualize tweets during an event in real time, and how it can identify the influential and active people in the back-channel communication during an event.

I could not attend the second day because I needed to grade the final exam papers, but at least I could attend the keynote by Noshir Contractor. From the keynote speech and the three sessions I attended, I learned about new techniques and metrics for analyzing social networks and people's influence on a network. Afterwards I felt that even though many people are using visual analytics for network visualization, their analysis and presentation could be more understandable if they knew more about NetViz Nirvana: you can do a lot just by following some principles for removing node clutter and edge overlaps. You can use visualization to make your presentation look flashy and cool, but that is not visual analytics. You know how important your research is when you see that other people are not doing it right. And again I wondered, as a visual analytics researcher: how can we make people realize the importance of using visualization in the RIGHT way?


Jan 30, 2013

Understanding the Cognitive Walkthrough Process

In one of my interviews for a summer research internship, I was asked how I would evaluate my software without running a usability study or getting expert feedback, and when I should be confident that it is ready. Yes, I can always use my own judgement, but that does not count as a scientific method. If I am the only person evaluating a UI, without any user experiment or expert feedback, what method can I rely on? I knew about the Cognitive Walkthrough process for evaluating a user interface without involving users, but had not actually used it on any project. I thought I had better understand it properly this time.

Cognitive Walkthrough:
The objective is to evaluate the usability of a user interface.
It does not require a user study, and can be applied to an early prototype of a tool.
The designer of the interface can perform this evaluation.

But what is this walkthrough? First, look back at cognitive theory: how do users interact with an interface without prior knowledge of the system and without any training? They have a particular goal in mind, say, searching for a word in a document. Then they scan the UI and look for interface elements that might serve the purpose, for example a button labeled 'search', or a search option in the right-click menu. Then, if one is available, they select that button or click the 'search' option in the menu. What happens after that? If the button is actually for searching for a word in the document, it will prompt the user to enter the word and then perform the search; after the search finishes, it might show a dialog box with how many times the word appears in the document and highlight the occurrences. On the other hand, that search button may not be for word search at all; it might instead search for other files in the directory. In that case the user should get feedback from the system that it is a file search input. This process of setting a goal, searching the interface, selecting an interface element, and processing feedback from the interface makes up the cognitive process of users interacting with an interface.

To evaluate an interface, designers and their peers follow the same model, which we can call a Cognitive Walkthrough. They define a goal, list the possible courses of action that a user might follow to reach the goal, evaluate how likely a user is to select each course of action, and finally evaluate the system's feedback to the user.

The evaluators can be UI experts or designers. They evaluate:
-the user's goal (searching for a word): whether the goal is clear to the users, and whether the users will know what to do to get what they intend,
-the accessibility of the control designed for the goal (whether the search button is easily visible and accessible),
-whether the control labels are appropriate for the goal (whether the label or tooltip correctly says that it is for searching for a word inside the document), and
-whether the feedback provided by the action is understandable to the users (the button should open an input dialog box for word entry; after searching it should say that the search is complete, show the count and highlight the words, or say that there is no match).

So, finally, a Walkthrough Evaluation Sheet contains the 4 criteria mentioned above. This evaluation can be done at an early stage of design and development, even with a paper prototype; it can detect design flaws early, and it is cheaper than recruiting users for a usability test. However, it cannot fully replace a usability study: it can miss many usability issues because of false assumptions about the users or an incomplete description and decomposition of the tasks, and the real interface may not be the same as the early prototype.
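To make the sheet less abstract, here is one way it could be written down as a small Python structure, using the word-search example from above. The class name, the field names, and the two sample steps are my own naming for illustration, not a standard format from the reference.

```python
from dataclasses import dataclass


@dataclass
class WalkthroughStep:
    """One user action in the walkthrough, judged against the four criteria."""
    action: str
    goal_is_clear: bool
    control_is_accessible: bool
    label_matches_goal: bool
    feedback_is_understandable: bool
    note: str = ""

    def problems(self):
        """Return the names of the criteria this step fails."""
        checks = {
            "goal is clear": self.goal_is_clear,
            "control is accessible": self.control_is_accessible,
            "label matches goal": self.label_matches_goal,
            "feedback is understandable": self.feedback_is_understandable,
        }
        return [name for name, passed in checks.items() if not passed]


# A hypothetical sheet for the "search a word in a document" task.
sheet = [
    WalkthroughStep(
        "Find the search control", True, True, False, True,
        note="The label 'search' does not say whether it searches words or files",
    ),
    WalkthroughStep("Enter the word and read the result", True, True, True, True),
]

for step in sheet:
    failed = step.problems()
    status = "OK" if not failed else "needs work: " + ", ".join(failed)
    print(f"{step.action}: {status}")
```

Each row is one step the user has to take, and any criterion marked False is a candidate design flaw to fix before a real usability test.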

Reference: http://www.sigchi.org/chi95/proceedings/tutors/jr_bdy.htm





Nov 16, 2012

Websites to generate color theme

I use this one for creating posters:
Kuler from Adobe. https://kuler.adobe.com/#themes/rating?time=30

Here is another one that was suggested to me but that I haven't used yet:
Color Theme Designer.  http://colorschemedesigner.com/
This one has a check for color blindness.

Also it's always fascinating to read blog posts from EagerEyes: 
http://eagereyes.org/blog/2011/you-only-see-colors-you-can-name