Workflow

The full process from raw data to the final visualization is outlined below:

  1. Primary data processing and audit
  2. Secondary data acquisition and preparation
    1. Translate titles using Google Sheets
    2. Correlate origin with country data
    3. Scrape URLs from arkyves.org
  3. Constructing composite SQLite database
  4. SQL query to aggregate a two-column CSV file of poster id and a pipe-delimited list of ic_stem values as ic_list
     -- one row per poster; ic_list joins that poster's ic_stem values with '|'
     select p.id as id,
            group_concat(ic_stem, '|') as ic_list
       from posters as p
       join ic_obsv as o on p.id = o.id
       join ic_codes as c on c.ic_en_ = o.ic_en_
      group by p.id;
  5. Import the CSV into Sci2
  6. Extract co-occurrence network from the column ic_list
  7. Save as GraphML
  8. Open GraphML with Gephi
  9. Export the nodes as a CSV
  10. Import nodes into SQLite database
  11. SQL query to join poster data to node IDs, save output as node data CSV
      -- one row per Gephi node: poster attributes plus aggregated language and ic_* fields
      create view node_data as select
          p.id as poster_id, inventory, title, title_en,
          p.origin as origin, country, region,
          year_from, year_to, (year_to - year_from + 1) as lifespan,
          group_concat(distinct language) as lang_list,
          url, img,
          group_concat(ic_stem) as ic_stem_list,
          group_concat(distinct ic_d1) as ic_d1_list,
          round(avg(ic_depth), 2) as ic_depth_avg,
          Label, g.Id as Id
      from posters as p
      left join gephi_nodes as g on p.id = Label
      left join origin_names as n on p.origin = n.origin
      left join ic_obsv as o on o.id = p.id
      left join ic_codes as c on o.ic_en_ = c.ic_en_
      left join ic_codes_all as a on ic_d2 = a.notation
      left join language_obsv as l on p.id = l.id
      group by g.Id;
  12. Join node data CSV to Gephi network from step 8
  13. Partition by region:
      {'North America': 'lightblue',
       'Europe': 'darkblue',
       'Africa': 'red',
       'South America': 'pink',
       'Asia': 'yellow',
       'Oceania': 'green'}
  14. Size by year_from:
      {'1983': 1, '2012': 50}
  15. Apply layout: Force Atlas 2, default settings
  16. Export Sigma.js web application. Include search and image URL 'img'
  17. Add custom functionality
    1. Highlight active node
    2. Show inactive nodes as grey
  18. Add custom style, UI layout, and legend
  19. Deploy on web host ivmooc-mrmattsim.rhcloud.com
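
The CSV aggregation and co-occurrence extraction steps above can be illustrated with a minimal Python sketch. Sci2 performs the extraction through its own interface, and the workflow does not describe how it builds edges, so the rule here is an assumption: because the later join matches node labels to poster ids, posters are treated as nodes, and two posters are linked when their ic_list values share at least one code, weighted by the number of shared codes. The sample rows and the function names read_ic_lists and poster_edges are hypothetical.

```python
import csv
import io
from itertools import combinations

# Invented sample in the two-column shape produced by the aggregate query.
SAMPLE_CSV = """id,ic_list
1,11A|25F
2,11A|25F|31A
3,31A
"""


def read_ic_lists(text):
    """Parse the CSV into {poster id: set of codes}, splitting ic_list on '|'."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["id"]: {code for code in row["ic_list"].split("|") if code}
        for row in reader
    }


def poster_edges(code_sets):
    """Assumed rule: link two posters if they share a code.

    The edge weight is the number of shared codes.
    """
    edges = {}
    for a, b in combinations(sorted(code_sets), 2):
        shared = code_sets[a] & code_sets[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


edges = poster_edges(read_ic_lists(SAMPLE_CSV))
print(edges)  # {('1', '2'): 2, ('2', '3'): 1}
```

Posters 1 and 3 share no code in the sample, so no edge is created between them. In a force-directed layout, posters with many shared codes would tend to be drawn closer together.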

Visual encoding

  • spatial distribution: coöccurrence
  • palette: region
  • size: year_from
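
The palette and size mappings can be written down as a lookup table and a rescaling function. The sketch below reuses the region colours and the year_from endpoints (1983 maps to 1, 2012 maps to 50) listed in the workflow. Linear interpolation between those endpoints, clamping of out-of-range years, and the grey fallback for unknown regions are assumptions, as are the names node_color and node_size.

```python
# Region palette copied from the workflow's partition settings.
REGION_COLORS = {
    "North America": "lightblue",
    "Europe": "darkblue",
    "Africa": "red",
    "South America": "pink",
    "Asia": "yellow",
    "Oceania": "green",
}

# Size endpoints copied from the workflow: 1983 -> 1, 2012 -> 50.
YEAR_MIN, YEAR_MAX = 1983, 2012
SIZE_MIN, SIZE_MAX = 1.0, 50.0


def node_color(region, default="grey"):
    """Look up the region colour; the grey fallback is an assumption."""
    return REGION_COLORS.get(region, default)


def node_size(year_from):
    """Rescale year_from linearly onto the size range (assumed linear)."""
    year = min(max(year_from, YEAR_MIN), YEAR_MAX)  # clamp (assumption)
    fraction = (year - YEAR_MIN) / (YEAR_MAX - YEAR_MIN)
    return SIZE_MIN + fraction * (SIZE_MAX - SIZE_MIN)


print(node_color("Europe"), node_size(2012))  # darkblue 50.0
```

Under this mapping, later posters are drawn larger, so the most recent items dominate the view.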

Interaction

  • active node
  • highlight active node
  • gray basemap
  • search
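
The highlight rule amounts to a single colour override. The deployed application is a Sigma.js export, so the real logic is JavaScript. This Python sketch only illustrates the rule, assuming that at most one node is active at a time and that every other node falls back to grey. The name display_colors is hypothetical.

```python
def display_colors(base_colors, active_id=None, muted="grey"):
    """Return the colour to draw for each node.

    With no active node, every node keeps its base colour. Otherwise only
    the active node keeps its colour and the rest are drawn in `muted`.
    """
    if active_id is None:
        return dict(base_colors)
    return {
        node_id: (color if node_id == active_id else muted)
        for node_id, color in base_colors.items()
    }


base = {"n1": "red", "n2": "pink", "n3": "yellow"}
print(display_colors(base, active_id="n2"))
# {'n1': 'grey', 'n2': 'pink', 'n3': 'grey'}
```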

 

