Oct 2, 2013

Grace Hopper 2013: Keynote Panel and Student Opportunity Lab

October 1st:
Thought Minneapolis would be chilly, but nope, just usual autumn weather, nothing much different from Maryland. The Millennium hotel is fine, everything is so easily accessible, we just walked a few blocks and found places to hang out. Dinner at the Jerusalem restaurant was awesome. And oh, did I mention that we went there by a rickshaw driven by a freshman girl from UMN:) That was superb!

October 2nd:
Keynote panel: instead of a keynote speech, they made it a Q&A panel, very interactive and lively. Much of the focus was on Lean In Circles. Inspirational indeed. Sheryl Sandberg (COO, Facebook) emphasized the role of peers, not just mentors. I agree with this vision: mentors may not be from the same generation or share the same viewpoint. Peers can be more inspiring, as they are of the same age and probably a similar background, and they understand the culture and time that we are in.

The Q&A session was very interesting and thought-provoking:

1) My daughter wants to study biology, but I want her to be in CS, what to do?
2) Should we have conferences for women? Should we consider gender as a factor at all?
3) How to increase the number of female CS majors?
4) What would you change if you had a magic wand?
5) Why do freshmen or high school girls not want to pursue CS?

Student research opportunity lab: each table had a separate topic, and 10 attendees sat around 2 or 3 expert representatives from different professional backgrounds. I attended the one on HCI/UX design for the first round-table panel. It was a good experience. We had UX designers from LinkedIn and Chegg discussing with us the importance of design in product development and why we should put the users first.

May 15, 2013

Visualizing World Languages in Wikipedia

After seeing in the infographic created by Funders and Founders (http://notes.fundersandfounders.com/post/50347902559/worlds-top-languages-on-the-internet) that the number of speakers of a language is not correlated with the volume of internet content in that language (obvious as that may be), I decided to collect the number of Wikipedia articles in those languages. So I collected the article counts from the Wikipedia editions of the top 10 languages in the world and made these charts using Excel.

Sorted by Wikipedia articles
Sorted by population
After sorting by the number of Wikipedia articles, I see that although Mandarin Chinese is spoken by the highest number of people in the world, Chinese drops to 6th position when ranked by number of Wikipedia articles. The bad thing about Excel is that after I sort the columns, it does not retain the same color for the same language:(

Then I decided to look at the percentage of articles in those languages: not the percentage of all Wikipedia articles, but each language's share among the articles in these ten languages. So I drew this pie chart:
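The share calculation behind a pie chart like this can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet I actually used; the function name `shares` and the counts in `example_counts` are made-up placeholders, not the numbers I collected from Wikipedia.

```python
def shares(article_counts):
    """Return each language's share (in percent) of the combined article count."""
    total = sum(article_counts.values())
    if total == 0:
        raise ValueError("no articles to compare")
    return {lang: 100.0 * n / total for lang, n in article_counts.items()}


# Placeholder numbers for illustration only, NOT the counts collected for the charts.
example_counts = {"lang_a": 400, "lang_b": 100, "lang_c": 500}

example_shares = shares(example_counts)

# Print the languages from the largest share to the smallest.
for lang, pct in sorted(example_shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang}: {pct:.1f}%")
```

Because every share is divided by the same total, the percentages always add up to 100, which is exactly what a pie chart needs.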

Visualizing World Languages and Bangla Content in Internet

Found this visualization in my Facebook feed, shared by some of my friends, and also on my lab's mailing list:

Funders and Founders made this visualization that compares the world's languages based on their number of speakers and their internet content. I am attaching the image from their website:

I would like to know more about the data and what they mean by "language used on the internet": language in text? Language in video/music? Everything? For the internet, another interesting observation would be to see which languages are generating more content now than before, because when the internet started, everything was definitely in English.

I see Bengali is in 7th position, way to go population growth! At least we are among the top ten languages, based on number of speakers. But then I look at the bar graph at the bottom and can see the discrepancy. I am glad that Chinese and Spanish show some correlation between the number of speakers and the volume of internet content. The problem with the visualization is that the circles show the number, not the percentage, while the bars show the percentage, not the number. In both cases, I would like to see both percentage and number. I would also like to see both datasets in a similar visual form; I do not see why one is a circle and the other is a bar. It makes it difficult to compare the % of world population with the % of internet content. But I appreciate the effort to make a point with this visualization. I am confused, though, about the thought bubble with a person in a turban, another in a farmer's hat, and another in a baseball cap: is the content creator thinking about all of them? So s/he is trying to include internationalization features [for Chinese/Sikh/American people]?

Makes me sort of sad though. I don't know about other languages, but in the case of Bengali, the Wikipedia movement is trying to create more content in Bengali. The catch is that people who can afford a computer and internet access already know workable English (because they are privileged enough to go to school and have electricity), so they don't feel the need for Bengali content. And the people who actually need it in Bengali cannot afford computers. When our higher-level textbooks and exams are also in English, it makes more sense to search for academic documents in English. We cannot deny the fact that the lingua franca of modern-day academic publication and research is English.

Although I wonder what the chart would look like if they only considered social/creative content (like blog posts and Facebook posts). Standard Unicode fonts for Bengali started around 2001; before that, people had to write Bengali using Roman script. Phonetic keyboards made it a lot easier, so we can type in Bengali without using a Bengali keyboard layout. It is easier than before, but still not as easy as typing in English.

Bengali-speaking area. Source: Wikipedia.

Most of the Bengali content I have come across on the internet is:
       Text: newspaper articles, blogs, social media status updates, textbooks.
       Video: Bengali songs, movies, or TV dramas (YouTube), video lectures (Shikkhok.com).
       Audio: Music sites sharing mp3s.

Blog posts and newspaper articles are written in Bengali script, but Facebook posts and tweets are not always. Most people transliterate Bengali into the Roman alphabet, probably because of the Roman keyboard and because Bangla writing software is not easy to use. First of all, we don't need to install any software to use the Roman alphabet, and people who seldom write in Bangla might not be interested in using such software.

Another reason is familiarity: we do not have Bangla writing options on cell phones (at least in most cases), so people are already used to texting in Bangla using the Roman alphabet, and they do the same when they update their status on Facebook.

Moreover, there is no plugin that lets me write Bangla in-place when I update my status in Facebook/ Twitter. I have to type it somewhere else and then copy-paste the text in the status update text-area. 

Also, a big issue is reaching the appropriate audience. Almost everyone understands English ... OK, almost everyone using the internet understands English. So am I losing some audience when I write in my native language? Or maybe I can gain more Bengali audience by writing in Bangla.

Now I come back to the original point, the privileged people may not care about more Bangla content, and the people who need Bangla content the most are living on the edge, so we cannot make a business model depending on that population. Then who are the consumers? If there is some e-commerce website in Bangla how will it compete against the English ones? Or do we even need such websites in Bangla? I am not aware of the current situation of mobile apps in Bangladesh, so I cannot actually tell whether the fishermen or farmers are actually using mobile internet to sell their products or not. But if that scheme turns out to be cost-effective for them, then that's a huge area to develop Bangla mobile apps.

Still, I am optimistic about the use of Bangla on the internet. Now we have a full-fledged search engine, Pipilika (the ant): http://pipilika.com/ where we can search for Bangla content. We learn best when we learn in our mother tongue, so there is definitely a need for more Bangla content that will help us learn. And no matter what, we think in our own language, so when it comes to sharing our creativity, we would prefer our own language. When I see how the Bangla blogosphere is emerging (to the point where the government is considering censoring it), I feel good about the future of creative content in Bangla on the internet. I mean, aren't we the people who love to write poems and fiction!

Feb 19, 2013

Activity on #Shahbag in Twitter: Part 5: Two significant groups emerged

I last collected tweets on Feb 18th. The death of a blogger-activist shook the nation, and we see many related posts on Twitter. The above image from NodeXL shows the network generated by the reply, mention, and retweet relations; I did not include the follow relation here. I noticed several groups, with some people central within their groups, yet the connected people are not necessarily of similar opinion. Then what makes them connected? Why are people with different opinions in the same group and people with similar opinions in separate groups? It's the replies and mentions. After I read the tweets, it was clear that they were trying to make their points to people of the opposite opinion. Some strong supporters of the #Shahbag movement were trying to make the non-supporters understand their stand, while the opposing group was also presenting its point of view to the supporting group. So people with opposite opinions were communicating in order to argue.

As this graph did not contain the follow relationship, we do not see strong ties or grouping, because people usually follow like-minded people, even if they mention or reply to others who hold the opposite opinion. So this time I pulled the network including both the follower relation and the reply/mention relation, and then grouped the users with a clustering algorithm. With the follower relationship included, we can see two major groups.
This graph considers both the follower and reply/mention relationships, and therefore the clustering is different from the previous one. I am only showing the follow relationship, with orange links. The people in the left group (136 people) are connected to each other, and they mostly follow people within this group. Their tweets contain news, URLs about the protest, and sometimes historical documents regarding war crimes, and ask for punishment of the war criminals. On the other hand, the group on the right, with 68 people, opposes the movement and argues that it is led by the government. What was not obvious from the previous reply/mention graph became apparent from the follow-relationship graph: who belongs to which side of the movement.

But hey, we also see some orange lines running between the two groups. That means there are people from one group following people from the other. Does this mean that people follow someone with the opposite political stand? I wanted to identify the connectors between the two groups. I sorted the users (nodes) by their betweenness centrality and then by in-degree (the number of links coming in to them) in the graph, and the highest was AlJazeera (the account with the maximum number of people connected to it), which is a news agency. People from both sides follow the AlJazeera account, making it a strong connector between the two groups (the algorithm placed AlJazeera in the left group, as that group had more connections with AlJazeera). Then I realized other news agencies might play similar roles: I found that another news agency also had strong connections within both groups. Another account, though belonging to the supporter group, had lots of followers from the opposing group, which was interesting to observe.
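For readers who want to see what those two metrics measure, here is a rough Python sketch. It is not what NodeXL does internally (I only used the values NodeXL reported); it is a plain implementation of Brandes' algorithm on a toy network, and it assumes the links are treated as undirected when computing betweenness. The account names (`l1`, `r1`, `news_account`) are invented for the example.

```python
from collections import Counter, deque


def betweenness(links):
    """Betweenness centrality (Brandes' algorithm), treating links as undirected."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    score = dict.fromkeys(neighbours, 0.0)
    for source in neighbours:
        dist = {source: 0}
        paths = dict.fromkeys(neighbours, 0)  # number of shortest paths from source
        paths[source] = 1
        preds = {n: [] for n in neighbours}
        order = []                            # nodes in breadth-first visiting order
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbours[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = dict.fromkeys(neighbours, 0.0)
        for w in reversed(order):             # farthest nodes first
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    return {n: s / 2 for n, s in score.items()}  # each pair was counted twice


# Toy data (hypothetical accounts): (follower, followee) pairs.
follows = [
    ("l1", "l2"), ("l1", "news_account"), ("l2", "news_account"),
    ("r1", "r2"), ("r1", "news_account"), ("r2", "news_account"),
]

in_degree = Counter(followee for _, followee in follows)
scores = betweenness(follows)

# Sort by betweenness first, then by in-degree, highest first.
ranking = sorted(scores, key=lambda n: (-scores[n], -in_degree[n]))
print(ranking[0], scores[ranking[0]], in_degree[ranking[0]])
```

In this toy network the only way from the left accounts to the right accounts runs through `news_account`, so it gets all the betweenness, which is the same pattern the news agency showed in the real graph.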

I selected AlJazeera and moved it near the border between the two groups, and now we see how it dragged a lot of orange links along with it (middle-top). As a result, the band of links across the middle of the border became thinner; we see fewer links there. We can say AlJazeera is a bridge node.

But it was not the only account on the border line. I also found some other accounts that have followers/followees from both groups. I dragged them near the border to show their connections with both sides.

Links from those accounts are highlighted in green color.

After that, I removed the biggest bridge node, AlJazeera, to see how the graph changes: compare with the second image in this post and see how much difference the bridge nodes make to understanding the network dynamics. Some of the other bridge nodes are users who are not taking any particular side, and it is not clear what their opinion of the movement is. They seem neutral or less aggressive, but they are also tweeting heavily about the event, and hence became important bridge nodes. People from both groups follow them and mention or reply to them.

Feb 15, 2013

Activity on #shahbag in Twitter: part 4

Analysis of Feb 15:

Here people are grouped by their connections and follower-followee relationships. No wonder that at this stage of the movement there are people from both sides: people who are supporting the movement and people who are opposing it. Today (Feb 15), a blogger-activist who supported the movement was murdered.

The clustered sub-graphs clearly show the difference of opinion and how active people are on both sides. I selected the tweets from group 1, the biggest connected group; its members follow each other and retweet each other's tweets. Their tweets contain news and anger about the murder of the blogger. The hashtags they used are: bangladesh, jamaat, shibir, thabababa, islam, terrorist, razakar. The top links they shared are the genocide archive of Bangladesh, an interactive timeline of the Bangladesh liberation war, and the news about the murder of the blogger.

In contrast, the people in group 4 (the second largest group, at the top right next to the biggest one) seem to have a different opinion: they are opposing the Shahbag movement, claiming that the murder of the blogger was carried out either by the government/police or in an internal clash. This group has its own follower base, and in their tweets they are supporting Jamaat/Shibir. They are also talking about matters unrelated to the movement, like the Padma bridge, BDR, amnesty, etc.

Here, for anonymity, I have used sphere shapes instead of profile pictures, and the bigger nodes are the more important ones. Different colors indicate different time zones: red is USA and green is Bangladesh. So in both groups, we can see that people from different countries are involved.

Using the Twitter stream graph, we can see the trend and what people are talking about today: http://www.neoformix.com/Projects/TwitterStreamGraphs/view.php (you can go to this link to create your own)

I omitted the names of the users to preserve their anonymity. But the stream graph shows which topics were prevalent on Twitter today.

Feb 12, 2013

Activity on #Shahbag in Twitter: part 3

Feb 13:

I am posting the Feb 13 analysis:

Here the icons are sized in proportion to the number of times each person engaged in a conversation (replying, being replied to, mentioning, being mentioned by someone, being retweeted, etc.). Their horizontal position is related to their time zone, so people in the USA are on the left side and people in Bangladesh are on the right. Also, people at the top joined Twitter recently, while people at the bottom are long-term Twitter users (though I doubt how often they actually used Twitter before this movement!). I selected the 'most talked about' person in this graph and the people who are talking with him or about him (connections shown in red). Any idea why this person is so active and mentioned so many times? Also, see how some new users became so active within just a few days?

Feb 12:
I am posting about the data I collected today, Feb 12.

I looked at the graph and realized that some people are tweeting alone: they are not connected to other people by retweets or mentions, so they are kind of island nodes in the graph. I clustered the graph by connected component and drew all the disconnected people at the bottom of the graph, and now we can see that lots of people are just tweeting all by themselves; they are not followed by or following other people.
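The grouping by connected component can be sketched as below. In practice I let NodeXL do this; the snippet is only a minimal illustration of the idea, and the user names and links are invented.

```python
from collections import defaultdict


def connected_components(users, links):
    """Group users so that everyone in a group is reachable from everyone else."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    components = []
    for start in users:
        if start in seen:
            continue
        seen.add(start)
        component = []
        stack = [start]
        while stack:                       # depth-first walk from this user
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(component))
    return components


# Hypothetical users; a link means one retweeted, replied to, or mentioned the other.
users = ["u1", "u2", "u3", "u4", "u5"]
links = [("u1", "u2"), ("u2", "u3")]

components = connected_components(users, links)
islands = [group[0] for group in components if len(group) == 1]
print(components)
print("tweeting alone:", islands)
```

The single-member components are the island nodes: here `u4` and `u5`, who never interacted with anyone else in the data.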

On Twitter, a lot of activity revolves around retweeting, replying to each other, and resharing URLs and news. If people want to spread news about some event or topic, just posting new tweets can be exhausting. While some people are surrounded by lots of followers and retweeters, a lot of users do not know how to connect with others by following them or retweeting them.

Feb 10, 2013

Activity on #Shahbag in Twitter: the people, what they are talking, what they are sharing. Part 2

Today, BCB (Bangladesh Cricket Board) was among the top mentioned accounts in all the tweets that had the #shahbag hashtag. Also, some people are using #shahabag or #shahbagh. It would be better to stick to just one spelling to avoid confusion and to keep the trend consistent.
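When counting tweets for the analysis, the spelling variants can be folded into one tag before counting. Here is a small sketch of that idea; choosing `shahbag` as the canonical spelling is my own assumption, and the sample tweet is invented.

```python
import re

# Spelling variants seen in the tweets, all mapped to one canonical tag.
VARIANTS = {"shahbag", "shahabag", "shahbagh"}
CANONICAL = "shahbag"  # assumption: any one of the variants would do

HASHTAG = re.compile(r"#(\w+)")


def normalise_hashtags(text):
    """Rewrite every variant hashtag to the canonical one; leave other tags alone."""
    def fix(match):
        tag = match.group(1)
        return "#" + CANONICAL if tag.lower() in VARIANTS else match.group(0)
    return HASHTAG.sub(fix, text)


normalised = normalise_hashtags("Join us #Shahbagh #bangladesh")
print(normalised)
```

This only helps on the analysis side, of course; for the trend on Twitter itself, people still need to agree on one hashtag.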

Part 3 here.
Part 4 here.

This time I am looking at how users around the world are getting connected through their tweets. Each box holds the users from one country. The more people tweeting from that country, the larger the box. The largest one is indeed the people tweeting from Bangladesh, and our very own Tamim Iqbal is one of the biggest nodes in that box. Why? Because here the sizes are proportional to the number of times other people retweeted a user's tweets or mentioned them. Tamim himself made one tweet on #shahbag, but that one tweet was retweeted 27 times. I am not sure whether this is Tamim's actual official Twitter account, but that's definitely inspirational. As the Bangladesh cricket team also expressed their solidarity with the movement today, lots of tweets were about that, too. Our other celebrities should also come forward like this.

#Shahbag tweets connecting the world. 
The thick connections between the groups show that people from Bangladesh are connected with the rest of the world: the news from Dhaka is spread around by people in other countries in the form of retweets and replies. My first intuition was that maybe it is mostly Bangladeshis staying abroad who are tweeting, as Twitter is not that popular in Bangladesh (which may have changed in these few days). But this graph proved me wrong. It clearly shows that the biggest group is tweeting from Bangladesh. We are definitely not far behind when it comes to utilizing social media for a national priority.

You may not be able to read the names of each country clearly from the image, so I am going to list them here (in decreasing number of people):

Bangladesh, USA, England, India, France, Australia, UAE, Belgium, Kuwait, Colombia, Israel, Singapore, Holland, Nepal, Morocco, Denmark, Thailand, Germany.

Remember that this is just a snapshot of one hour; it does not include each and every tweet. But it gives us a quick picture of the story.

Feb 9, 2013

Activity on #Shahbag in Twitter: the people, what they are talking, what they are sharing. Part 1

As the movement of Bangladeshis gains momentum, more and more Bangladeshis are using social network services like Twitter and Facebook to discuss the movement, share videos, and organize events around it.

For the last three days I have been collecting the tweets and the Twitter networks of the people who are tweeting with the hashtag #Shahbag. Many of the tweets are from very new users: people who joined Twitter in order to spread the word and tell the world how they feel about this movement. I could see that they still do not have profile pictures, so Twitter shows the default 'egg' icon for them. Twitter might notice how an event like this can increase its user base.

At the end of the day on Feb 7, I drew this Twitter network using the social network analysis tool called NodeXL:
Not that big, few people, and not many tweets, but let's see what happened the next day:
Now we have so many people that I had to use some encoding to keep all of them in the image, so I resized their profile icons: if they tweeted more, I made their icons bigger, and they are also placed in the middle. If they tweeted less, their profile icon in the image got smaller. You also see the links between the people: these links indicate that they either retweeted, replied to, or mentioned each other. So, when A retweets his friend B's tweet, it draws an arrow from A to B.
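To make the encoding concrete, here is a rough sketch of how the arrows and icon sizes could be derived from a list of tweets. NodeXL handled all of this for me; the record format below (`author`, `targets`), the pixel sizes, and the linear scaling are assumptions I made up for the illustration, not Twitter's or NodeXL's actual format.

```python
from collections import Counter


def build_network(tweets):
    """Count tweets per author and arrows from each author to the people they engage."""
    tweet_counts = Counter()
    links = Counter()
    for tweet in tweets:
        author = tweet["author"]
        tweet_counts[author] += 1
        for other in tweet["targets"]:     # people retweeted, replied to, or mentioned
            if other != author:
                links[(author, other)] += 1
    return tweet_counts, links


def icon_size(count, max_count, smallest=16, largest=64):
    """Scale an icon linearly with tweet count (pixel sizes are arbitrary)."""
    return smallest + (largest - smallest) * count / max_count


# Hypothetical sample data.
sample = [
    {"author": "A", "targets": ["B"]},       # A retweets B: arrow from A to B
    {"author": "A", "targets": ["B", "C"]},  # A mentions B and C
    {"author": "C", "targets": []},          # C tweets without mentioning anyone
]

tweet_counts, links = build_network(sample)
busiest = max(tweet_counts.values())
sizes = {user: icon_size(n, busiest) for user, n in tweet_counts.items()}
print(links)
print(sizes)  # B never tweeted in this sample, so B gets no size here
```

The arrows are directional on purpose: A retweeting B says something about A's interest in B, not the other way round.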

But as you have already guessed, there is now more participation: more tweets, more people. Today when I pulled the data for all these people, I noticed that the graph looked like a hairball, so I tried to organize the people and their Twitter activity into groups. How? Some people know each other personally or follow each other on Twitter; they reply to their Twitter friends, mention them, and retweet each other's tweets, so they are connected with links. For example, you can retweet your friends' tweets or reshare the URLs they share on Twitter, and your other followers in turn reshare or retweet the things you tweet or share. So we all get connected this way, and small clusters form among all those users. This is what I got after grouping the people who are more connected to each other (Feb 9th, 2013). Inside each group we can see one or two prominent people who are more active within their own group, so they are basically leading their group by creating more useful content that their friends reshare or retweet.

Also, today's most-shared URLs on Twitter regarding the #shahbag movement are listed here:


Jan 31, 2013

Presenting Conference Monitor at Social Informatics 2012

This was the first annual Social Informatics conference, and it was in DC. I was hoping that my paper would get accepted. I always hope the same for all my papers and all conferences, but this time I was very optimistic, as I had worked a lot on this. Finally, I was very happy to get the acceptance email and very positive reviews from the committee.

My flight to Dhaka was right after the last day of the conference, so I was preparing my talk at the same time as I was preparing myself for my trip home. I am glad that my colleagues at the HCIL helped me with the practice talk. This is a culture I highly value here: getting feedback from our peers before we actually give the talk at a conference. After the practice talk, I changed my presentation a lot, and later made a demo video of the tool that I presented at the conference. Showing the demo helped a lot: it made clear what my research was and what the tool does. The conference did not have a separate demonstration session, so I showed the demo of Conference Monitor during my talk.

I was overloaded with coffee; my talk came right after the final exam of the Information Visualization course, for which I was the TA. After the final exam, I collected all the exam papers, took the metro to the conference venue, and had one more hour before my talk. The talk went better than I expected. I received positive feedback on my research topic, and other researchers showed interest in the tool and wanted to use it for their own research. And, above all, no one fell asleep during my talk, what more can you expect?:) I demonstrated how Conference Monitor can be used to analyze and visualize tweets during an event in real time, and how it can identify the influential and active people in the back-channel communication during an event.

I could not attend the second day because I needed to grade the final exam papers, but at least I could attend the keynote by Noshir Contractor. From the keynote speech and the three sessions I attended, I learned about new techniques and metrics for analyzing social networks and people's influence on a network. Afterward I felt that even though many people are using visual analytics for network visualization, their analysis and presentation could be more understandable if they knew more about NetViz Nirvana: you can do a lot just by following some principles of removing node clutter and edge overlaps. You can use visualization to make your presentation look flashy and cool, but that is not visual analytics. You know how important your research is when you see how other people are not doing it right. And again, I wondered: as visual analytics researchers, how can we make people realize the importance of using visualization in the RIGHT way?


Jan 30, 2013

Understanding the Cognitive Walkthrough Process

In one of my interviews for a summer research internship, I was asked how I would evaluate my software without running a usability study or getting expert feedback, and when I should be confident that it is ready. Yes, I can always use my own judgement, but that does not count as a scientific method. If I am the only person evaluating a UI, with no user experiment or expert feedback, what method can I rely on? I knew about the Cognitive Walkthrough process for evaluating a user interface without involving users, but had not actually used it on any project. I thought I had better understand it properly this time.

Cognitive Walkthrough:
Objective is to evaluate the usability of a user interface.
Does not require user study. Can be evaluated using the early prototype of a tool.
The designer of the interface can perform this evaluation.

But what is this walkthrough? First, look back at the cognitive theory: how do users interact with an interface without prior knowledge of the system and without any training? They have a particular goal in mind, say, searching for a word in a document. Then they scan the UI and look for interface elements that might serve the purpose, for example, a button with the label 'search', or a search option in the right-click menu. Then, if one is available, they select that button or click the 'search' option in the menu. What happens after that? If the button is actually for searching for a word in the document, it will prompt the user to enter the word and then perform the search; after the search finishes, it might show a dialog box with how many times the word appears in the document and highlight the occurrences. On the other hand, that search button may not be a word search at all; it might instead search for other files in the directory. In that case the user should get feedback from the system that it is a file search input. This process of setting a goal, searching the interface, selecting an interface element, and processing feedback from the interface comprises the cognitive process of users interacting with an interface.

For evaluating an interface, designers and their peers follow the same model, which we can call a Cognitive Walkthrough. They define a goal, list the possible courses of action a user might follow to reach the goal, evaluate how likely a user is to select each course of action, and finally evaluate the system's feedback to the user.

The evaluators can be UI experts or designers.
-They first understand the user's goal (searching a word), whether the goal is clear to the users or not, whether the user will know what to do to get what they intend,
-the accessibility of the control designed for the goal (if the search button is easily visible or accessible),
-whether the goal and the control labels are appropriate (if the label or tooltip correctly says that it is for searching word inside the document), and
-whether the feedback provided by the action is understandable for the users (the button should open an input dialog box for word entry, after searching it should say that the search is complete, show the count and highlight the words, or say if there is no match).

So finally, a Walkthrough Evaluation Sheet contains the four criteria mentioned above. This evaluation can be done at an early stage of design and development, even with a paper prototype; it can detect design flaws early, and it's cheaper than recruiting users for a usability test. However, it cannot fully replace a usability study: it can miss lots of usability issues because of false assumptions about the users or incomplete description and decomposition of the tasks, and finally, the real interface may not be the same as the early prototype.
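To make the sheet less abstract, here is one way it could be written down as a small data structure. This is my own sketch, not a standard format from the reference; the class names, the criterion keys, and the example step are all invented for illustration.

```python
from dataclasses import dataclass, field

# The four criteria, one yes/no question per action step.
CRITERIA = (
    "goal_clear",               # will the user know what to do to reach the goal?
    "control_accessible",       # is the control easy to see and reach?
    "label_matches_goal",       # does the label or tooltip match the goal?
    "feedback_understandable",  # will the user understand the system's feedback?
)


@dataclass
class StepEvaluation:
    action: str
    answers: dict               # criterion name -> True/False
    notes: str = ""

    def failures(self):
        """Criteria answered 'no' (or left unanswered) for this step."""
        return [c for c in CRITERIA if not self.answers.get(c, False)]


@dataclass
class WalkthroughSheet:
    goal: str
    steps: list = field(default_factory=list)

    def problems(self):
        """Map each problematic action to the criteria it failed."""
        return {s.action: s.failures() for s in self.steps if s.failures()}


# Hypothetical example based on the word-search scenario.
sheet = WalkthroughSheet(goal="search for a word in a document")
sheet.steps.append(StepEvaluation(
    action="click the 'search' button",
    answers={
        "goal_clear": True,
        "control_accessible": True,
        "label_matches_goal": False,   # the button actually searches for files
        "feedback_understandable": True,
    },
    notes="Label does not say whether it searches words or files.",
))

print(sheet.problems())
```

Filling in one such record per action step gives a list of suspected problems that can be fixed before any users are recruited.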

Reference: http://www.sigchi.org/chi95/proceedings/tutors/jr_bdy.htm