Statistics – Probability Density Function and Z Table

It turns out that normally distributed values are quite important in statistics. Not only is the pattern remarkably common, but the central limit theorem also enables statisticians to draw conclusions about how a given treatment will affect a given population. To make such inferences, we need to learn about the Probability Density Function and a useful shortcut: […]

Meetup Members Analysis

I recently started a group on Meetup.com for folks interested in computer programming. This was my first time doing so, and I had not worked out where the group would actually meet up. The first step to finding a meet-up spot was to look at the membership roll and find where the individuals were located. I […]

Statistics – Standard Deviation

Most people are familiar with the concepts of the mean, median, and mode. They are measures of the central tendency of a value that has been measured in a given population. They tell us, in different ways, about the value of an attribute at the heart of the population, rather than at the positive or negative […]

Introduction to NumPy

Lately, I’ve been studying statistics and data analysis. I already knew the Python programming language, so when looking at the two most widely used programming tools in this domain, Pandas and R, I chose Pandas – a software library for Python. Pandas uses another library in the construction of its data structures […]

Pandas Basics I: Series and DataFrames

What is Pandas? Pandas is a free software (software libre) data analysis library for the Python programming language. The library provides analysts and programmers with data structures optimized for working with large data sets, and methods for examining and manipulating that data. It uses another free software library, NumPy, for underlying data structures, and Pyplot to generate plots […]

SQL to Pandas Translation

I’m experienced in working with SQL for data wrangling and analysis, but have recently started using the Python Pandas library for similar tasks. What I really like about Pandas is the ability, combined with matplotlib, to plot and visualize the data once it’s been successfully curated. Coming from a SQL background, I’ve been approaching problems […]