Library for scoring questionnaires

Development Status
- 5 - Production/Stable
Environment
- Console
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Natural Language
- English
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Scientific/Engineering :: Information Analysis
- Software Development :: Libraries :: Python Modules

Project description

Scorify: A simple tool for scoring psychological self-report questionnaires.

Background

Many psychology studies use one or more self-report questionnaires to understand their participants. These responses go into CSV files with one question per column, one participant per row.

Scoring these files is a bunch of work. Often, several questionnaires (or sub-scales) are included in one CSV file, and half of the questions are "reverse-scored" to combat people's tendency to agree with questions. Scoring these files usually means spending a whole bunch of time in Excel, and no one likes doing that.

Scorify aims to fix this.

Installation

scorify requires Python 3.5.

pip install scorify

should have you set up.

Examples

See examples/ for some test files. To run the neurohack data and scoresheet, do something like:

score_data neurohack_scoresheet.csv neurohack_April+2,+2019_11.05.csv

Getting started

Say you have a CSV file like the one below and want to score five columns. Answers range from 1 to 5, and the third and fifth questions are reverse-scored.

| ppt | happy1 | happy2 | happy3 | happy4 | happy5 |
| 3001 | 1 | 2 | 1 | 3 | 4 |
| 3002 | 4 | 1 | 5 | 1 | 2 |
| 3003 | 1 | 3 | 2 | 3 | 1 |
| ... |

Create a scoresheet that looks like:

A	B	C	D
layout	header
layout	data

transform	normal	map(1:5,1:5)
transform	reverse	map(1:5,5:1)

score	ppt
score	happy1	happy	normal
score	happy2	happy	normal
score	happy3	happy	reverse
score	happy4	happy	normal
score	happy5	happy	reverse

measure	happy	mean(happy)

Then you call score_data with that scoresheet and datafile, like:

score_data scoresheet.csv datafile.csv

Your output just goes to STDOUT, and you should see it renaming columns. To save the output if it looks good, just pipe it to a file:

score_data scoresheet.csv datafile.csv > output.csv
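To make the arithmetic behind that scoresheet concrete, here is a minimal Python sketch that reverse-scores happy3 and happy5 and averages the five answers for the three example participants. It is an illustration only, not scorify's own code; the names `reverse`, `REVERSED`, and `happy_score` are hypothetical.

```python
def reverse(value, low=1, high=5):
    """Reverse-score a value on a low..high scale, like map(1:5,5:1)."""
    return low + high - value

# Rows from the example table: participant ID, then happy1..happy5.
rows = [
    (3001, [1, 2, 1, 3, 4]),
    (3002, [4, 1, 5, 1, 2]),
    (3003, [1, 3, 2, 3, 1]),
]
REVERSED = {2, 4}  # zero-based positions of happy3 and happy5

def happy_score(answers):
    """Mean of the answers after reversing the reverse-scored items."""
    scored = [reverse(a) if i in REVERSED else a for i, a in enumerate(answers)]
    return sum(scored) / len(scored)

scores = {ppt: happy_score(answers) for ppt, answers in rows}
print(scores)  # {3001: 2.6, 3002: 2.2, 3003: 3.2}
```

These are the values you would expect in the happy column if the scoresheet behaves as described.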

Other common operations

Excluding participants

If some participant data is particularly messy, you can exclude it using your scoresheet like this:

A	B	C
exclude	ppt_id_column_name	3001

Keeping second row headers

If your question headers have a second row with verbose question text in them, you can keep that in the scored data by adding a layout keep instruction:

layout header
layout keep
layout data

Repeat the layout keep instruction if you want to keep more than one row.

Scoresheet reference

The main input to scorify is a comma- or tab-delimited "scoresheet" that has many rows and up to five columns. The first column tells what kind of command the row will be, and will be one of: layout, rename, exclude, transform, score, or measure.

layout

The layout section tells scorify what your input data looks like. It must contain a header and data, but skip and keep are also valid. data tells scorify that the rest of your input file is data. So:

layout header
layout skip
layout data

would tell scorify to expect a header row, skip a line, and then read the rest of the file as data.

layout header
layout keep
layout data

would result in scorify expecting a header row, keeping the next line as-is, and reading the rest of the file as data.
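One way to picture the layout instructions is as a line-by-line recipe for the top of the input file. The sketch below is an assumption about the semantics described above, not scorify's implementation; `apply_layout` is a hypothetical helper.

```python
def apply_layout(layout, rows):
    """Split rows into (header, kept rows, data rows) following the layout.

    layout is a list such as ["header", "keep", "data"]; each instruction
    consumes one row, except "data", which takes everything that remains.
    """
    header, kept = None, []
    for index, instruction in enumerate(layout):
        if instruction == "data":
            return header, kept, rows[index:]
        if instruction == "header":
            header = rows[index]
        elif instruction == "keep":
            kept.append(rows[index])
        elif instruction != "skip":
            raise ValueError(f"unknown layout instruction: {instruction}")
    raise ValueError("layout must include a data instruction")

rows = [
    ["ppt", "q1"],
    ["Participant", "How happy are you?"],
    ["3001", "4"],
    ["3002", "2"],
]
header, kept, data = apply_layout(["header", "keep", "data"], rows)
print(kept)  # [['Participant', 'How happy are you?']]
```

With `["header", "skip", "data"]` the second row would be dropped instead of kept.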

rename

The rename section renames a header column, and looks like:

rename original_name new_name

Columns can only be renamed once, and must use a new, unique name. You must use the column's new name everywhere in the scoresheet.
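The two rename rules can be sketched as a small validation step. This is a hypothetical illustration; `rename_columns` and the choice of `ValueError` are assumptions, and scorify may report these problems differently.

```python
def rename_columns(header, renames):
    """Apply (original_name, new_name) pairs to a header row.

    Each original column may be renamed at most once, and every new
    name must not clash with a name already in the header.
    """
    result = list(header)
    already_renamed = set()
    for old, new in renames:
        if old not in header:
            raise ValueError(f"no such column: {old}")
        position = header.index(old)
        if position in already_renamed:
            raise ValueError(f"column already renamed: {old}")
        if new in result:
            raise ValueError(f"name already in use: {new}")
        result[position] = new
        already_renamed.add(position)
    return result

print(rename_columns(["ppt", "Q1"], [("Q1", "happy1")]))  # ['ppt', 'happy1']
```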

exclude

The format of an exclude line is:

exclude column value

which will, as you might expect, exclude rows where column == value.
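In Python terms, an exclude line behaves roughly like a row filter. The sketch below assumes values are compared as strings, since that is how they come out of a CSV file; `exclude_rows` is a hypothetical helper, not part of scorify.

```python
import csv
import io

def exclude_rows(rows, exclusions):
    """Drop rows where row[column] == value for any (column, value) pair."""
    return [
        row for row in rows
        if not any(row.get(column) == value for column, value in exclusions)
    ]

sample = io.StringIO("ppt,happy1\n3001,1\n3002,4\n3003,1\n")
rows = list(csv.DictReader(sample))
kept = exclude_rows(rows, [("ppt", "3001")])
print([row["ppt"] for row in kept])  # ['3002', '3003']
```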

transform

Sometimes, you'll want to reverse-score a column or otherwise change its value for scoring. And you'll want to give that some kind of sane name. Transforms let you do this. They look like:

transform name mapper

Right now, you can apply three transformations.

`map()`

A linear mapping. Example:

transform reverse map(1:5,5:1)

which will map the values 1,2,3,4,5 to 5,4,3,2,1. This will happily map values outside its input domain.
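A linear mapping like this can be sketched as a straight line through the two endpoints. The code below is an illustration under the assumption that out-of-range inputs are extrapolated along the same line; `linear_map` is a hypothetical name, not scorify's API.

```python
def linear_map(in_start, in_end, out_start, out_end):
    """Build a function mapping in_start..in_end linearly onto out_start..out_end."""
    slope = (out_end - out_start) / (in_end - in_start)

    def mapper(value):
        return out_start + (value - in_start) * slope

    return mapper

reverse = linear_map(1, 5, 5, 1)  # same endpoints as map(1:5,5:1)
print([reverse(v) for v in (1, 2, 3, 4, 5)])  # [5.0, 4.0, 3.0, 2.0, 1.0]
print(reverse(6))  # 0.0, an input outside 1..5 is still mapped
```

The last line shows why out-of-range answers are worth checking before scoring: they produce a number rather than an error.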

`discrete_map()`

A mapping for discrete values. Useful for mapping numbers to human-meaningful values.

transform score_gender discrete_map("1":"f", "2":"m")

Unmapped values will return a blank.

This transform can be useful when combined with join() (below) to combine an array of checkboxes into one column.

`passthrough_map()`

Like discrete_map(), though unmapped values will be unchanged. So, if you have:

transform score_gender passthrough_map("1":"f", "2":"m")

a value of "999" will still be "999".
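The difference between the two discrete mappings comes down to the fallback for unmapped values. A minimal sketch, using plain dictionaries rather than scorify's own parser (both function names below mirror the transforms but are hypothetical Python helpers):

```python
def discrete_map(mapping):
    """Unmapped values become a blank string."""
    return lambda value: mapping.get(value, "")

def passthrough_map(mapping):
    """Unmapped values are returned unchanged."""
    return lambda value: mapping.get(value, value)

codes = {"1": "f", "2": "m"}
print(discrete_map(codes)("1"))       # f
print(discrete_map(codes)("999"))     # prints an empty line
print(passthrough_map(codes)("999"))  # 999
```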

score

The score section is where you tell scorify which columns you want in your output, what measure (if any) they belong to, and what transform (again, if any) you want to apply. These look like

score column measure_name transform

measure_name and transform are both optional. So, to reverse score (using the reverse we defined up above) a column called happy_1 and add it to the happy measure, use:

score happy_1 happy reverse

You can optionally pass a 5th value, which will define the output column name:

score happy_1 happy reverse ReverseHappy1
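The optional fields of a score row can be sketched as a small parser. This is a hypothetical illustration: `ScoreRow` and `parse_score_row` are not scorify names, and the default output name of "column: measure" is an assumption based on the headers shown in the complete example's output.

```python
from typing import NamedTuple, Optional

class ScoreRow(NamedTuple):
    column: str
    measure: Optional[str]
    transform: Optional[str]
    output_name: str

def parse_score_row(cells):
    """Parse ["score", column, measure?, transform?, output_name?]."""
    fields = [cell.strip() for cell in cells]
    if len(fields) < 2 or fields[0] != "score":
        raise ValueError("expected: score column [measure] [transform] [output_name]")
    fields += [""] * (5 - len(fields))  # pad the missing optional cells
    _, column, measure, transform, output_name = fields[:5]
    if not output_name:
        # Assumed default naming; see the lead-in above.
        output_name = f"{column}: {measure}" if measure else column
    return ScoreRow(column, measure or None, transform or None, output_name)

print(parse_score_row(["score", "happy_1", "happy", "reverse", "ReverseHappy1"]))
```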

measure

The measure section computes aggregate measures of your scored data. These lines look like:

measure final_name aggregator(measure_1, measure_2, ..., measure_n)

We support the following aggregators:

`mean()`

As you might expect, this calculates the mean of the measure or measures listed. Example:

measure happy mean(happy)

If any values in the measures are non-numeric, returns NaN.

`mean_imputed()`

Computes the mean of the measure. However, if any of the values in the measures are non-numeric, this fills in the mean of the numeric values. For example, mean_imputed(1, '', 3, 5) is 3.

`sum()`

Computes the sum of the listed measures. Example:

measure sad sum(sad)

If any values in the measures are non-numeric, returns NaN.
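The NaN behavior of mean() and sum() can be sketched by converting every value to a float and letting NaN propagate through the arithmetic. The helper names below are hypothetical, and the sketch does not handle an empty list of values.

```python
import math

def to_number(value):
    """Return value as a float, or NaN if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan

def strict_sum(values):
    """Sum of the values; any non-numeric value makes the result NaN."""
    return sum(to_number(v) for v in values)

def strict_mean(values):
    """Mean of the values; any non-numeric value makes the result NaN."""
    return strict_sum(values) / len(values)

print(strict_sum([1, 2, 3]))     # 6.0
print(strict_mean([1, "", 3]))   # nan
```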

`sum_imputed()`

Computes the sum of the measure. However, if any of the values in the measures are non-numeric, this fills in the mean of the numeric values. For example, sum_imputed(1, '', 3, 5) is 12.

`imputed_fraction()`

The fraction of the data that is non-numeric and would have a value imputed for it. For example, imputed_fraction(1, '', 3, 5) is 0.25.
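The three imputation aggregators can be sketched together, since they share one step: replace each non-numeric value with the mean of the numeric ones. This is an illustration of the arithmetic in the examples above, not scorify's code; it assumes at least one numeric value is present.

```python
def _to_float(value):
    """Return value as a float, or None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def impute(values):
    """Replace non-numeric values with the mean of the numeric ones."""
    numbers = [_to_float(v) for v in values]
    known = [n for n in numbers if n is not None]
    fill = sum(known) / len(known)
    return [fill if n is None else n for n in numbers]

def mean_imputed(values):
    filled = impute(values)
    return sum(filled) / len(filled)

def sum_imputed(values):
    return sum(impute(values))

def imputed_fraction(values):
    missing = sum(1 for v in values if _to_float(v) is None)
    return missing / len(values)

answers = [1, "", 3, 5]
print(mean_imputed(answers), sum_imputed(answers), imputed_fraction(answers))
# 3.0 12.0 0.25
```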

`join()`

join() is a little trickier. It collects all the non-blank values in the listed measures, and joins them with the | character. Useful if you have a set of values selected by checkbox. For example, if you had three measures that would either be blank or not for things participants might endorse, you could collate them into one column with:

measure liked_pets join(likes_cats, likes_dogs, likes_horses)

If a participant had cats for likes_cats and horses for likes_horses, you'd get:

cats|horses
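The same collation can be sketched in a few lines of Python. `join_values` is a hypothetical helper; treating `None` and whitespace-only strings as blank is an assumption.

```python
def join_values(values, separator="|"):
    """Join the non-blank values with the separator."""
    return separator.join(
        str(v) for v in values if v is not None and str(v).strip() != ""
    )

print(join_values(["cats", "", "horses"]))  # cats|horses
```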

`ratio()`

ratio(a, b) will compute the ratio of two columns; in other words: a / b. Notably, this works on other measures, so you can take the ratio of sums or means. In those cases, the ratio line needs to come after the other measures' lines do.
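The ordering requirement follows naturally if measures are evaluated top to bottom, each one able to see only the results computed before it. The sketch below illustrates that idea with made-up scored values; `evaluate` and its tuple format are hypothetical, and it only handles mean and a ratio of earlier measures.

```python
def evaluate(measure_lines, scored):
    """Evaluate (name, kind, inputs) measure lines in order.

    scored maps a measure name to its list of scored values.
    """
    results = {}
    for name, kind, inputs in measure_lines:
        if kind == "mean":
            values = [v for key in inputs for v in scored[key]]
            results[name] = sum(values) / len(values)
        elif kind == "ratio":
            # Looks up earlier results; fails if they are not computed yet.
            numerator, denominator = (results[key] for key in inputs)
            results[name] = numerator / denominator
        else:
            raise ValueError(f"unsupported aggregator: {kind}")
    return results

scored = {"happy": [2.0, 1.0], "sad": [5.0]}  # illustrative values
lines = [
    ("happy_score", "mean", ["happy"]),
    ("sad_score", "mean", ["sad"]),
    ("happiness_ratio", "ratio", ["happy_score", "sad_score"]),
]
print(evaluate(lines, scored))
# {'happy_score': 1.5, 'sad_score': 5.0, 'happiness_ratio': 0.3}
```

Moving the ratio line above the two mean lines makes this sketch raise a `KeyError`, which mirrors the rule that the ratio line must come last.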

`min()`

min(measure_1, measure_2) will output the minimum numeric value in the given measures. Non-numeric values will cause NaN.

`max()`

max(measure_1, measure_2) will output the maximum numeric value in the given measures. Non-numeric values will cause NaN.
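One detail worth knowing if you reproduce min() or max() yourself: Python's built-in `min` and `max` do not reliably propagate NaN, so the non-numeric check has to be explicit. A minimal sketch, with hypothetical helper names, that returns NaN whenever any value is non-numeric:

```python
import math

def _all_numbers(values):
    """Return the values as floats, or None if any value is non-numeric."""
    numbers = []
    for value in values:
        try:
            number = float(value)
        except (TypeError, ValueError):
            return None
        if math.isnan(number):
            return None
        numbers.append(number)
    return numbers

def strict_min(values):
    numbers = _all_numbers(values)
    return min(numbers) if numbers else math.nan

def strict_max(values):
    numbers = _all_numbers(values)
    return max(numbers) if numbers else math.nan

print(strict_min([3, 1, 2]), strict_max([3, 1, 2]))  # 1.0 3.0
print(strict_min([3, "", 2]))                        # nan
```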

Complete example

If you take a scoresheet that looks like:

A	B	C	D
layout	header
layout	data

exclude	PPT_COL	bad_ppt1
exclude	PPT_COL	bad_ppt2

transform	normal	map(1:5,1:5)
transform	reverse	map(1:5,5:1)

score	PPT_COL
score	HAPPY_Q1	happy	normal
score	SAD_Q1	sad	normal
score	HAPPY_Q2	happy	reverse

measure	happy_score	mean(happy)
measure	sad_score	mean(sad)
measure	happiness_ratio	ratio(happy_score, sad_score)

and run it on data that looks like:

PPT_COL	EXTRA	HAPPY_Q1	SAD_Q1	HAPPY_Q2
ppt1	foo	4	2	3
ppt2	bar	2	5	5

... you'll get output like:

PPT_COL	HAPPY_Q1: happy	SAD_Q1: sad	HAPPY_Q2: happy	happy_score	sad_score	happiness_ratio
ppt1	4	2	3	3.5	2	1.75
ppt2	2	5	1	1.5	5	0.3

Credits

Scorify uses several excellent libraries:

Release history

Version	Release date
0.9.3	May 21, 2020 (this version)
0.9.2	May 18, 2020
0.8.1	Oct 18, 2018
0.8.0	Apr 30, 2018
0.7.0	Nov 29, 2017
0.6.0	Jun 21, 2017
0.5.0	Apr 27, 2017
0.4.0	Apr 3, 2015
0.3.0	Oct 30, 2014
0.2.1	Feb 17, 2014
0.2	Feb 17, 2014

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scorify-0.9.3.tar.gz (24.0 kB)

Uploaded May 21, 2020 Source

Built Distribution

scorify-0.9.3-py3-none-any.whl (27.6 kB)

Uploaded May 21, 2020 Python 3

Hashes for scorify-0.9.3.tar.gz
Algorithm	Hash digest
SHA256	`eb3d5f6943e0f78aeac25e176ee3e92957987cb6f6304aeada0b1855ea2f7fb4`
MD5	`f1d052b240a36397c24865041071b666`
BLAKE2b-256	`231a40db4ff759328b60de235c49be45b3d6b11103e250ebaf0651baf6481c55`

Hashes for scorify-0.9.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6d8a878af77acaba84e4764c1c29dada6e787e4c98dd877439ea7e2fdb8c7a5d`
MD5	`8fb4190537ea028677d267426ec8863f`
BLAKE2b-256	`0965a67997a85316e3601eb3dc827d9bcdd7c8840c6c8e58f5b6d040453e0410`
