StatLab Articles

Analysis of Ours to Shape Comments, Part 5

Introduction

In the penultimate post of this series, we’ll use some unsupervised learning approaches to uncover comment clusters and latent themes among the comments to President Ryan’s Ours to Shape website.

The full code to recreate the analysis in the blog posts is available on GitHub.

Ours to Shape, quanteda, R, text analysis, text mining, Michele Claibourn

Analysis of Ours to Shape Comments, Part 4

Introduction

We're still analyzing the comments submitted to President Ryan’s Ours to Shape website.

In the fourth installment of this series (we’re almost done, I promise), we’ll look at the sentiment – aka positive-negative tone, polarity, affect – of the comments to President Ryan’s Ours to Shape website.

Ours to Shape, quanteda, R, text analysis, text mining, Michele Claibourn

Analysis of Ours to Shape Comments, Part 3

Introduction

To recap, we’re exploring the comments submitted to President Ryan’s Ours to Shape website (as of December 7, 2018).

Ours to Shape, quanteda, R, text analysis, text mining, Michele Claibourn

Analysis of Ours to Shape Comments, Part 2

Introduction

In the last post, we began exploring the comments submitted to the Ours to Shape website. We looked at the distribution across categories and contributors, the length and readability of the comments, and a few key words in context. While I did more exploration of the data than reported, the first post gives a taste of the kind of dive into the data that usefully precedes analysis.

Ours to Shape, quanteda, R, text analysis, text mining, Michele Claibourn

Analysis of Ours to Shape Comments, Part 1

Introduction

As part of a series of workshops on quantitative analysis of text this fall, I started examining the comments submitted to President Ryan’s Ours to Shape website. The site invites people to share their ideas and insights for UVA going forward, particularly in the domains of service, discovery, and community.

Ours to Shape, quanteda, R, text analysis, text mining, Michele Claibourn

How to Apply a Graduated Color Symbology to a Layer Using Python for QGIS 3

I was recently working on a project in QGIS 3 with a member of UVA Health's Oncology department. This person wanted to take a set of patient data (with identifying information removed) and, after some other processing, apply a graduated color scheme to the results, shading them from light to dark based on intensity.

You can find a sample dataset for this project here:

https://github.com/epurpur/PyQGIS-Scripts/blob/master/TestZipCodes.zip

Python, visualization, QGIS, Erich Purpur

How to Use the Field Calculator in Python for QGIS 3

Recently, I have taken the dive into Python scripting in QGIS. QGIS is a really nice open-source (and free!) alternative to ESRI's ArcGIS. While QGIS is a little quirky and generally not quite as user-friendly as ArcGIS, it still provides nearly the same functionality. Personally, I've become a fan of it and have now even taught a short, 1-credit course in the University of Virginia's Batten School of Public Policy titled GIS for Public Policy.

Python, data wrangling, QGIS, Erich Purpur

A Beginner's Guide to Text Analysis with quanteda

A lot of introductory tutorials to quanteda assume that the reader has some base of knowledge about the package's functionality or how it might be used. Other tutorials assume that the user is an expert in R and in what goes on under the hood when you're coding. This introductory guide will assume none of that. Instead, I'm presuming a very basic understanding of R (like how to assign variables) and that you've just heard of quanteda for the first time today.

quanteda, R, text analysis, text mining, Leah Malkovich

Assessing Type S and Type M Errors

The paper Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors by Andrew Gelman and John Carlin introduces the idea of performing design calculations to help prevent researchers from being misled by statistically significant results in studies with small samples and/or noisy measurements.
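
To give a rough sense of what such a design calculation involves, here is a minimal Python sketch. It assumes a normal sampling distribution and a positive true effect; the function name `design_calc`, its parameters, and the input values in the usage lines are hypothetical choices for illustration, not code from the paper or the article. It returns the power, the Type S error rate (the chance that a significant estimate has the wrong sign), and a simulated exaggeration ratio (how much a significant estimate overstates the true effect on average).

```python
import random
from statistics import NormalDist


def design_calc(true_effect, std_error, alpha=0.05, n_sims=10000, seed=1):
    """Sketch of a design calculation under a normal approximation.

    Assumes true_effect > 0. All names here are illustrative.
    """
    norm = NormalDist()
    z = norm.inv_cdf(1 - alpha / 2)      # two-sided critical value
    ratio = true_effect / std_error

    p_hi = 1 - norm.cdf(z - ratio)       # significant, correct (positive) sign
    p_lo = norm.cdf(-z - ratio)          # significant, wrong (negative) sign
    power = p_hi + p_lo
    type_s = p_lo / power

    # Simulate replicated estimates and keep only the significant ones
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_sims)]
    significant = [abs(e) for e in estimates if abs(e) > z * std_error]
    if significant:
        exaggeration = (sum(significant) / len(significant)) / true_effect
    else:
        exaggeration = float("nan")
    return power, type_s, exaggeration


# Hypothetical small effect measured with a lot of noise
power, type_s, exaggeration = design_calc(true_effect=2, std_error=8)
print(f"power={power:.3f}, type S={type_s:.3f}, exaggeration={exaggeration:.1f}")
```

With a small effect and a large standard error, the power is low, and any estimate that does reach significance has to be far larger in magnitude than the true effect.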

R, power analysis, statistical methods, type m error, type s error, Clay Ford

Interpreting Log Transformations in a Linear Model

Log transformations are often recommended for skewed data, such as monetary measures or certain biological and demographic measures. Log transforming data usually has the effect of spreading out clumps of data and bringing together spread-out data. For example, below is a histogram of the areas of all 50 US states. It is skewed to the right due to Alaska, California, Texas and a few others.
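
The spreading-out and bringing-together effect can be sketched with a few made-up numbers. The Python snippet below uses hypothetical values, not the state areas, and compares the gaps between neighboring values on the raw scale and on the log10 scale.

```python
import math

# Hypothetical right-skewed values, chosen for illustration only
values = [1, 10, 100, 1000, 10000]
logged = [math.log10(v) for v in values]

# Gaps between neighboring values on the raw scale and on the log scale
raw_gaps = [b - a for a, b in zip(values, values[1:])]
log_gaps = [round(b - a, 6) for a, b in zip(logged, logged[1:])]

print("raw gaps:  ", raw_gaps)
print("log10 gaps:", log_gaps)
```

On the raw scale the gaps grow tenfold at each step, so the small values are clumped together and the large ones are far apart. On the log scale every gap is the same size.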

R, linear regression, statistical methods, log transformations, diagnostic plots, Clay Ford
