Tools and methods for capturing Twitter data during natural disasters, by Axel Bruns and Yuxian Eugene Liang. First Monday, Volume 17, Number 4 - 2 April 2012

Abstract: During the course of several natural disasters in recent years, Twitter has been found to play an important role as an additional medium for many–to–many crisis communication. Emergency services are successfully using Twitter to inform the public about current developments, and are increasingly also attempting to source first–hand situational information from Twitter feeds (such as relevant hashtags). The further study of the uses of Twitter during natural disasters relies on the development of flexible and reliable research infrastructure for tracking and analysing Twitter feeds at scale and in close to real time, however. This article outlines two approaches to the development of such infrastructure: one which builds on the readily available open source platform yourTwapperkeeper to provide a low–cost, simple, and basic solution; and, one which establishes a more powerful and flexible framework by drawing on highly scaleable, state–of–the–art technology.

… The value of community and serendipity is what’s driving the wild-fire emergence of hybrid workspaces in Boston, Cambridge and Somerville. In places like Artisan’s Asylum, experimentation and entrepreneurship intersect, engineers work next to artists and their collaborations fuel creativity …

Entrepreneurs who aren’t lucky enough to catch the eye of a venture capital firm on their own can compete for MassChallenge’s annual one-million-dollar prize. The money is just a nominal motivator; the real rewards are the networking opportunities and office resources allocated to some 125 startups selected to share a 27,000-square-foot floor, donated by developer Joe Fallon, in a Fan Pier high-rise. Started in 2010 by business school graduates John Harthorne and Akhil Nigam, MassChallenge’s hodgepodge of startups gets mentors from partner organizations and rare access to top investors. The office is a beehive of activity, where finalists work alongside peers with projects in a range of high-growth industries. They have use of legal advice, office cubes and a whiteboard so massive it’s being certified by Guinness World Records. Collaborations range from a biotech developing treatments for blindness to a company that delivers artisanal wines to customers’ homes. And for motivation, the entrepreneurs just need to look out their floor-to-ceiling windows on the waterfront to imagine themselves as masters of the universe.

Academia, too, is seeing a new generation of workspaces, like the Harvard i-lab (Hi for short) in the Allston building that was formerly home to WGBH. Director Gordon Jones describes the i-lab, which opened last November, as a startup, an experiment in bringing together students and resources from schools across the university. Harvard students have access to the center’s classes and its experts, but retain rights to their own intellectual property. The center also hosts workshops like “Startup Secrets: Company Formation,” taught by venture capitalist Michael Skok, with some of the seats reserved for the public. A circular open-floor plan, exposed ceiling and IdeaPainted surfaces sprinkled with inspirational quotes evoke the dorm rooms and hacker spaces that might lure the next Mark Zuckerberg, and maybe even keep him from dropping out this time. 

“Part of what this is about,” Jones says, “even in the building design, is high modularity. It’s intentionally unfinished. This is about all of us building this place together.” Instead of an office, Jones sits in an approachable, small cubicle. The intermixing of people who wouldn’t have met in a traditional classroom pays off. For example, a biology graduate’s thoughts on natural selection influenced a business school student’s theories on company performance …

Whereas Cambridge’s coworking spaces, both academic and industry, tend to be tech-oriented, Somerville’s workspaces are more community-driven. “It’s not about making millions of dollars,” Graney says. She started the Design Annex, a 1,400-foot center housed in Union Square’s former police station, so that designers of all stripes, who would normally work out of their home, would have 24/7 access to a place free from domestic interruptions. Since the Annex is part of a global network, members, for the price of a monthly gym membership, can use workspaces or conference rooms to conduct meetings anywhere in the world.


While Annexers tend to be designers already established in their careers, another building houses a panoply of fledgling businesses. Located in a 4,400-square-foot warehouse set back from the street, Fringe comprises 20 people and 16 companies—ranging from cult brewery Pretty Things Beer & Ale Project to a roof-garden business, Recover Green Roofs. There are coworking days when Fringe opens its brightly colored conference room to people looking for a place to work, and even provides them with free coffee. Businesses pay for space per square foot, and they can partition and decorate their offices to any taste. Custom-bike builders have a utilitarian shop next to a space shared by a video producer and studio photographer, which is overseen by a salvaged piece of Shepard Fairey street art depicting Andre the Giant. Every business is deliberately different, as meticulously curated as if it were part of an art exhibition …

“A business plan is a document investors make you write that they don’t read.”

Steve Blank, The Four Steps to the Epiphany

Based on Alex Osterwalder’s Business Model Canvas, optimized for Lean Startups:

Fast: Compared to writing a business plan, which can take several weeks or months, you can outline multiple possible business models on a canvas in one afternoon.

Portable: A single-page business model is much easier to share with others, which means it will be read by more people and updated more frequently.

Concise: Lean canvas forces you to distill the essence of your product. You have 30 seconds to grab the attention of an investor over a metaphorical elevator ride, and 8 seconds to grab the attention of a customer on your landing page.

Effective: Whether you’re pitching investors or giving an update to your team or board, Lean Canvas’ built-in presenter tools allow you to effectively document and communicate your progress.

… reverse innovation: this model inverts a company's usual process for creating and marketing products, which means developing products in and for emerging markets and later adapting them to more advanced economies.

"Reverse innovation is the opposite of glocalization, which starts from a global conception of a product and then adapts it to a local market, and which is what has been done in the United States and Europe. Both strategies are necessary within multinationals because each responds to different niches, situations, and business and market opportunities. They need to coexist."

Trust issues in Web service mash–ups, by Kevin Lee, Nicolas Kaufmann and Georg Buss. First Monday, Volume 16, Number 8 - 1 August 2011

Abstract

With the emergence of Web service mash–ups (Web applications that integrate different data sources), online data integration and aggregation is increasingly becoming the online norm for both commercial and non–commercial users. With such widespread adoption of data integration from discrete sources, the question emerges as to whether the resultant mash–up can be considered as trustworthy. This paper explores the concepts behind Web service mash–ups to determine the factors influencing their trustworthiness. The focus is on examining data quality and data assurance issues for both data providers and mash–up consumers.

Last month the OpenNet Initiative published a report that shines light on one of the more sensitive business practices of Western Internet security and filtering companies. These companies – including McAfee (an Intel subsidiary), Websense, and Netsweeper – promote their filtering technologies in the West as tools for parents and schools trying to shield children from online pornography and employers looking to maintain a professional work environment. But they also appear to make their software and URL categorization services available to state-run ISPs and telecoms in Middle Eastern and North African countries, such as Bahrain, UAE, Qatar, Oman, Saudi Arabia, Kuwait, Yemen, Sudan, and Tunisia. These ISPs and telecoms, and the governments behind them, use the software to filter out Internet content that they don’t want their citizens to see.

European public bodies produce thousands upon thousands of datasets every year - about everything from how our tax money is spent to the quality of the air we breathe.

We are challenging designers, developers, journalists, researchers and the general public to come up with something useful, valuable or interesting using open public data.

There are four main strands to the competition:

  • Ideas – Anyone can suggest an idea for projects which reuse public information to do something interesting or useful.
  • Apps – Teams of developers can submit working applications which reuse public information.
  • Visualisations – Designers, artists and others can submit interesting or insightful visual representations of public information.
  • Datasets – Public bodies can submit newly opened up datasets, or developers can submit derived datasets which they’ve cleaned up, or linked together …

[course notes]

THE ART OF DATA ANALYSIS: FROM SPREADSHEETS TO R

Rejected title: The (dark) arts of data analysis …

Juan Freire

Universidade da Coruña

http://juanfreire.net/

Instituto de Humanidades, Artes & Ciências Professor Milton Santos (IHAC)

Universidade Federal da Bahía (UFBA)

April 2010

1. INTRODUCTION. What is data analysis for?

Quantitative analysis … sometimes of qualitative information

a) Hypotheses

b) Exploring patterns (unknown relationships between variables)

http://en.wikipedia.org/wiki/Data_analysis

Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

Case: storylines in TV series

http://ubergrid.tumblr.com/post/528551058

Case: Cultural Analytics (Lev Manovich, Software Studies)

http://lab.softwarestudies.com/

http://www.flickr.com/photos/culturevis/

Manga research:

http://www.flickr.com/photos/culturevis/sets/72157623691111589/

http://lab.softwarestudies.com/2010/02/1000000-manga-pages-visualization.html

"The end of science" (the data deluge makes the scientific method obsolete)

http://www.wired.com/wired/issue/16-07

2. Before starting to design the data analysis

a) Available information sources

b) A priori hypotheses

c) Possible patterns in the data

3. Information sources:

- Units of information (cases)

- Content (variables)

- Types of content (coding): quantitative, semi-quantitative (ordered), categorical, 1/0
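The four coding types in item 3 can be illustrated with a minimal sketch. The records, field names, and the ordered scale below are invented for illustration and are not part of the course notes; one-indicator-column-per-category is just one common way to code a categorical variable.

```python
# Hypothetical survey records (cases); every field name and value is invented.
cases = [
    {"age": 34, "satisfaction": "high", "city": "Salvador", "owns_car": "yes"},
    {"age": 27, "satisfaction": "low", "city": "A Coruña", "owns_car": "no"},
    {"age": 45, "satisfaction": "medium", "city": "Salvador", "owns_car": "yes"},
]

# Ordered (semi-quantitative) scale: the ranks carry order, not distance.
SATISFACTION_RANK = {"low": 1, "medium": 2, "high": 3}


def encode(case, categories):
    """Turn one raw case into a row of numeric variables."""
    row = {
        "age": float(case["age"]),                                # quantitative
        "satisfaction": SATISFACTION_RANK[case["satisfaction"]],  # ordered
        "owns_car": 1 if case["owns_car"] == "yes" else 0,        # 1/0
    }
    # Categorical: one indicator column per category, with no implied order.
    for category in categories:
        row["city=" + category] = 1 if case["city"] == category else 0
    return row


cities = sorted({case["city"] for case in cases})
table = [encode(case, cities) for case in cases]
for row in table:
    print(row)
```

Each row of `table` is one case and each key is one variable, which is the cases-by-variables layout the notes describe.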

4. Phases of data analysis:

a) Database design: variables (coding), cases. Examples of databases

b) Data exploration

c) Data cleaning: errors, outliers, redefinition of variables

d) Statistical analysis - Data visualization

5. Exploratory data analysis

http://en.wikipedia.org/wiki/Exploratory_data_analysis

Exploratory data analysis (EDA) is an approach to analysing data for the purpose of formulating hypotheses worth testing, complementing the tools of conventional statistics for testing hypotheses…. It was so named by John Tukey to contrast with Confirmatory Data Analysis

… Tukey held that too much emphasis in statistics was placed on statistical hypothesis testing (confirmatory data analysis); more emphasis needed to be placed on using data to suggest hypotheses to test.

More at: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3.htm

EDA emphasizes graphical techniques while classical techniques emphasize quantitative techniques. In practice, an analyst typically uses a mixture of graphical and quantitative techniques.

- Data cleaning

- Visualizing patterns: suggesting hypotheses

- Planning the collection of new information
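As a rough illustration of mixing quantitative and graphical techniques, the sketch below prints a five-number summary and a crude text histogram. The sample values, the function names, and the choice of the "inclusive" quartile method are assumptions made for this example, not part of the course notes; other quartile conventions give slightly different numbers.

```python
import statistics
from collections import Counter

# Invented sample of integer measurements (not from the course notes).
values = [2, 4, 4, 5, 6, 7, 9]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), the numbers a box plot draws."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


def text_histogram(data, bin_width):
    """Return one line per non-empty bin, with one '#' per observation.

    Bin labels assume integer data: a bin covers start .. start+bin_width-1.
    """
    counts = Counter((x // bin_width) * bin_width for x in data)
    return [
        f"{start:>3}-{start + bin_width - 1:<3} {'#' * counts[start]}"
        for start in sorted(counts)
    ]


print(five_number_summary(values))
for line in text_histogram(values, bin_width=3):
    print(line)
```

The summary gives the numbers; the histogram shows the shape. Reading both together is the kind of quick look that can suggest a hypothesis worth testing.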

6. Statistical graphics

http://en.wikipedia.org/wiki/Graphical_technique

Statistical graphics, also known as graphical techniques, are information graphics in the field of statistics used to visualize quantitative data.

- Box plots (or box-and-whisker diagrams): http://en.wikipedia.org/wiki/Box_plot

- Histograms: http://en.wikipedia.org/wiki/Histogram

- Pareto chart: http://en.wikipedia.org/wiki/Pareto_chart

- Scatter plot: http://en.wikipedia.org/wiki/Scatter_plot

More sophisticated uses:

Correlation scatter-plot matrix for ordered-categorical data:

http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/

http://www.wired.com/science/discoveries/magazine/16-07/pb_visualizing

A visualization of thousands of Wikipedia edits that were made by a single software bot. Each color corresponds to a different page.
Image: Fernanda B. Viégas, Martin Wattenberg, and Kate Hollenbach 

7. Examples of exploratory analysis: outlier detection

http://www.itl.nist.gov/div898/handbook/prc/section1/prc16.htm (Engineering Statistics Handbook):

An outlier is an observation that lies an abnormal distance from other values in a random sample from a population. In a sense, this definition leaves it up to the analyst (or a consensus process) to decide what will be considered abnormal. Before abnormal observations can be singled out, it is necessary to characterize normal observations.

Detecting errors and/or outliers: box plots, scatter plots:

http://www.itl.nist.gov/div898/handbook/eda/section3/scattera.htm

http://www.itl.nist.gov/div898/handbook/eda/section3/boxplot.htm
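One common way to turn "abnormal distance" into a rule is the box-plot convention of flagging points that lie more than 1.5 times the interquartile range beyond the quartiles. The sketch below assumes that convention; the multiplier `k`, the "inclusive" quartile method, the function names, and the sample data are all choices made for this example, and the analyst still has to decide whether a flagged value is an error or a genuine extreme.

```python
import statistics


def tukey_fences(data, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(data, k=1.5):
    """Return the observations that fall outside the fences."""
    low, high = tukey_fences(data, k)
    return [x for x in data if x < low or x > high]


# Invented sample with one suspicious value (not from the handbook).
sample = [12, 14, 14, 15, 16, 17, 18, 19, 58]

print(tukey_fences(sample))
print(find_outliers(sample))
```

Raising `k` (for example to 3.0) flags only more extreme points, which is one way to separate mild from extreme candidates.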

8. Some examples of data analysis and scientific visualization in art

Harun Farocki. Deep Play

http://www.farocki-film.de/deepeg.htm

http://www.flickr.com/photos/architektur/sets/72157600380226624/

Ben Fry. Genetic cartographies (Processing)

Genomic databases: http://genome.ucsc.edu/cgi-bin/hgTables

http://benfry.com/aasd/

http://acg.media.mit.edu/people/fry/genocarto.html

http://benfry.com/genomevalence/

Analysis and visualization of food webs

http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.0060102&ct=1

Pacific Ecoinformatics and Computational Ecology Lab

http://foodwebs.org/index.html

9. Software for data analysis and visualization

- databases

- spreadsheets

- graphics packages (+ basic statistics)

- statistical packages (+ visualization + programming language)

Choosing software and learning curves: comparing spreadsheets and R

10. Database management systems

http://en.wikipedia.org/wiki/Database_management_system

- OpenOffice.org Base

- Microsoft Office Access

- MySQL …

OpenOffice.org Base Project: http://dba.openoffice.org/

http://www.openoffice.org/product/base.html

http://en.wikipedia.org/wiki/OpenOffice.org_Base

11. Spreadsheets (planilha eletrônica)

http://en.wikipedia.org/wiki/Spreadsheet

Animation of how a spreadsheet works:

http://upload.wikimedia.org/wikipedia/en/2/23/Spreadsheet_animation.gif

- OpenOffice.org Calc

- Gnumeric

- Microsoft Office Excel

- Google Docs

OpenOffice.org Calc Project: http://sc.openoffice.org/

http://www.openoffice.org/product/calc.html

http://en.wikipedia.org/wiki/OpenOffice.org_Calc

Gnumeric

http://projects.gnome.org/gnumeric/

http://en.wikipedia.org/wiki/Gnumeric

Statistical analysis with Gnumeric:

http://projects.gnome.org/gnumeric/doc/chapter-stat-analysis.shtml

Correlation Tool:

http://projects.gnome.org/gnumeric/doc/correlation-tool.shtml

The RGnumeric Package: a package that allows R to be used as a plugin for Gnumeric

http://www.omegahat.org/RGnumeric/

12. Some uses of spreadsheets:

* Data management and import

* Pivot tables (“DataPilot”)

* Correlation and regression
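To make the last item concrete, here is a minimal sketch of the arithmetic behind a correlation coefficient and a simple least-squares regression line. The paired observations and the function name are invented for illustration; the formulas are the standard Pearson correlation and ordinary least-squares fit, written out by hand rather than taken from any particular spreadsheet.

```python
import math

# Invented paired observations (hypothetical, for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


slope, intercept, r = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.4f}")
```

Writing the sums out this way mirrors how the calculation can be laid out column by column in a spreadsheet before moving to a statistical package.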

13. Statistical packages

- SAS: Business Analytics and Business Intelligence Software. Windows. $$$$$$$. GUI

http://www.sas.com/software/sas9/

- IBM SPSS Statistics (formerly: Statistical Package for the Social Sciences). Windows, Mac, Linux. $$$. GUI

http://www.spss.com/statistics/

- Statistica. Windows, Mac. $$$. GUI

http://www.statsoft.com/

- R. Free software. Linux, Mac, Windows. Command line

14. The R Project for Statistical Computing

http://www.r-project.org/

The Comprehensive R Archive Network: http://cran.es.r-project.org/

http://pt.wikipedia.org/wiki/R_%28linguagem_de_programa%C3%A7%C3%A3o%29

Packages in R:

- Collections of functions, data, and code

- compiled

- standardized format

Interface extensions:

- Windows, Mac: they come with a GUI that lets you do quite a lot through menus

- Script editing:

* Internal to the GUI

* External: Tinn-R, R-WinEdt, or via plugins

Script editors: http://www.sciviews.org/_rgui/projects/Editors.html

Visual editors for R (Windows only):

Tinn-R: http://www.sciviews.org/Tinn-R/

"Tinn-R is free, simple but efficient replacement for the basic code editor provided by Rgui"

RWinEdt: http://cran.r-project.org/web/packages/RWinEdt/index.html

Visualization and graphics (ggplot2): http://had.co.nz/ggplot2/

ggplot2 is a plotting system for R, based on the grammar of graphics, which tries to take the good parts of base and lattice graphics and none of the bad parts. It takes care of many of the fiddly details that make plotting a hassle (like drawing legends) as well as providing a powerful model of graphics that makes it easy to produce complex multi-layered graphics.


Sunlight Labs is pleased to announce our latest contest: “Design for America.” This 10-week-long design and data visualization extravaganza is focused on connecting the talents of art and design communities throughout the country to the wealth of government data now available through bulk data access and APIs, and to help nurture the field of information visualization. Our goal is simple and straightforward: to make government data more accessible and comprehensible to the American public. We hope to enliven and engage new communities, just as we did with Apps for America 1 and 2, as partners and participants in making government information more engaging to the American public. Our contest will end with a public announcement of the winners at Gov 2.0 Expo here in Washington, DC in May, in partnership with O’Reilly and TechWeb, and with a public gallery showing of the winners.

There’s an “artist” inside all of us so we’re creating multiple entry categories so that contestants have an opportunity to show off their skills wherever they are most comfortable. There’s room for all kinds of folks to participate — artists, data visualizers, specialists in info graphs and usability experts — to name a few.

Meet the geodesigner - Architect Magazine
Loosely defined as the integration of geographic analysis and tools into the design process, the term “geodesign,” while not proprietarily linked to ESRI, is viewed as part of the company’s lexicon by the geospatial community, broadly composed of urban planners, cartographers, geographers and other social scientists, and emergency response and military analysts, among others. Geodesign, as Dangermond sees it, is shorthand for the complex interrelationship of spatial data and architecture. It is the interface between land use, census blocks, traffic patterns, air quality tables, and any other data set, on the one hand, and the process of building—site planning, conceptual design, programming, and construction drawings—on the other.

About

The aim is social change. The path is regional collaboration. The focus is local.

Technology is changing our relationship with government. Not so long ago government made decisions with little public input. Those days are gone. Today, information technology has redefined the structure and authority of government. The problems our communities face are beyond the capacity of government to resolve alone. Cooperation, collaboration and openness are no longer questions of opportunity; they are essential means of conducting our community’s business effectively. Every citizen can be an active participant in reshaping their world. WE are the government.

The CivicApps.org site aims to encourage every citizen to be an active participant by putting the data in their hands. The CivicApps.org site was developed to source, profile, and accelerate innovative ideas using Web and mobile technologies. The aim is social change. The path is regional collaboration. The focus is local.

— San Francisco Open Source Policy

— Vancouver Open Source & Open Data Policy

— Portland Open Source & Open Data Policy

… You can learn more and contribute to the creation of resources for open cities with the nascent OpenMuni project.