ubvu/jstor-collections

Journal Overlap Analysis for JSTOR Collections

This repository contains a Python-based analysis to assess journal coverage overlap between JSTOR collections and other library collections, based on subscription metadata.

The input data represents journal subscriptions over time, where each row corresponds to a journal × period × collection triple.

Using journal identifiers (print and online ISSNs), the analysis examines whether the same journal-period appears in other JSTOR collections or in non-JSTOR collections, with full, partial, or complementary temporal coverage.

Special attention is given to journals that would be lost if JSTOR access were reduced or cancelled.


Input Data

A TSV file containing all journal subscriptions held by the library. Each row must represent a single subscription period in a specific collection.

Required columns:

| Column name | Description |
| --- | --- |
| oclc_collection_name | Name of the collection (used to identify JSTOR and sub-collections) |
| publication_title | Journal title (may vary slightly across collections) |
| print_identifier | Print ISSN (used for matching) |
| online_identifier | Online ISSN (used for matching) |
| date_first_issue_online | Start of coverage period |
| date_last_issue_online | End of coverage period (missing = assumed ongoing) |

Notes:

  • Journals are matched using both identifiers, with fallback logic if identifiers are swapped across sources.
  • Only records marked as "fulltext" are used for the analysis.
  • Missing end dates are treated as coverage through 2026.
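The notes above can be sketched as follows. This is a minimal illustration using only the standard library, not the repository's actual code: the function names, the assumption that dates start with a four-digit year, and the sample rows (collection names and ISSNs) are all hypothetical. The "fulltext" filter is left out because the column that carries it is not listed above.

```python
import csv
import io

ASSUMED_END_YEAR = 2026  # missing end dates are treated as coverage through 2026


def normalize_issn(value):
    """Strip hyphens and whitespace, uppercase; return None for empty values."""
    cleaned = (value or "").replace("-", "").strip().upper()
    return cleaned or None


def parse_year(value, default=None):
    """Assumption: dates start with a 4-digit year (e.g. '1990' or '1990-01-01')."""
    value = (value or "").strip()
    return int(value[:4]) if value else default


def load_subscriptions(tsv_text):
    """Read subscription rows from TSV text into a list of plain dicts."""
    rows = []
    for raw in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        issns = {normalize_issn(raw["print_identifier"]),
                 normalize_issn(raw["online_identifier"])} - {None}
        rows.append({
            "collection": raw["oclc_collection_name"],
            "title": raw["publication_title"],
            "issns": issns,
            "start": parse_year(raw["date_first_issue_online"]),
            "end": parse_year(raw["date_last_issue_online"], ASSUMED_END_YEAR),
        })
    return rows


def same_journal(a, b):
    """Rows match if they share any ISSN, even if print/online roles are swapped."""
    return bool(a["issns"] & b["issns"])


# Hypothetical sample: the second row swaps the two ISSNs and has no end date.
sample = (
    "oclc_collection_name\tpublication_title\tprint_identifier\t"
    "online_identifier\tdate_first_issue_online\tdate_last_issue_online\n"
    "JSTOR Collection A\tExample Journal\t1234-5678\t8765-4321\t1990-01-01\t2010-12-31\n"
    "Other Collection\tExample journal\t8765-4321\t1234-5678\t2005\t\n"
)
rows = load_subscriptions(sample)
```

Comparing the two identifiers as an unordered set is one simple way to make the match insensitive to swapped print and online ISSNs.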

Methodology

Period overlap classification

For each journal-period pair, matches are classified as:

  • Full: the comparison period completely covers the source period
  • Partial: periods overlap, but coverage is incomplete
  • Complementary: the periods do not overlap; they are adjacent or separated by a gap

Overlap percentages (share of years covered) are also calculated.
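One way to express this classification is sketched below. It assumes periods are inclusive year ranges and that the percentage is the share of source-period years covered by the comparison period; the function name `classify_overlap` is hypothetical.

```python
def classify_overlap(src_start, src_end, cmp_start, cmp_end):
    """Return (overlap_type, percent of source years covered).

    All arguments are years; both ranges are treated as inclusive.
    """
    overlap_start = max(src_start, cmp_start)
    overlap_end = min(src_end, cmp_end)
    if overlap_start > overlap_end:
        # No shared year: the periods are adjacent or separated by a gap.
        return "complementary", 0.0

    source_years = src_end - src_start + 1
    shared_years = overlap_end - overlap_start + 1
    pct = round(100.0 * shared_years / source_years, 1)

    if cmp_start <= src_start and cmp_end >= src_end:
        return "full", pct
    return "partial", pct
```

For example, a source period of 1990 to 2010 compared against 2005 to 2026 shares 6 of 21 years, so it is classified as partial with an overlap of 28.6%.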


Analysis

1. Overall co-occurrences

For each JSTOR journal-period pair (excluding Books):

  • Check for matches in all other collections
  • Flag whether it has:
    • any match
    • at least one full match
    • at least one partial match
    • at least one complementary match
  • Count total matches by overlap type
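The flags and counts could be derived roughly as below. This is a sketch under assumptions: matches are given as `(source_key, overlap_type)` pairs, and the key layout and output field names are hypothetical rather than the columns of the actual output sheet.

```python
from collections import Counter, defaultdict


def summarize_matches(source_keys, matches):
    """Build per-source flags and counts from (source_key, overlap_type) pairs."""
    counts = defaultdict(Counter)
    for source_key, overlap_type in matches:
        counts[source_key][overlap_type] += 1

    summary = {}
    for key in source_keys:
        c = counts[key]  # empty Counter if the source has no matches
        summary[key] = {
            "any_match": sum(c.values()) > 0,
            "has_full": c["full"] > 0,
            "has_partial": c["partial"] > 0,
            "has_complementary": c["complementary"] > 0,
            "n_full": c["full"],
            "n_partial": c["partial"],
            "n_complementary": c["complementary"],
        }
    return summary


# Hypothetical keys: (ISSN, start year, end year) of a JSTOR journal-period.
k0 = ("12345678", 1990, 2010)
k1 = ("11112222", 2000, 2026)
summary = summarize_matches(
    [k0, k1],
    [(k0, "full"), (k0, "partial"), (k0, "partial")],
)
```

Passing the full list of source keys separately keeps journal-periods without any match in the result, with all flags set to false.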

2. JSTOR inter-collection overlap

Analyzes overlap between JSTOR sub-collections themselves.

For each pair of JSTOR collections:

  • Counts how many unique journal-periods co-occur
  • Distinguishes full, partial, and complementary coverage
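A possible shape for the pairwise count is sketched below. It assumes match records of the form `(source_collection, comparison_collection, journal_period, overlap_type)`; these names and the sample values are hypothetical. Pairs are kept directed, because a full match from A to B does not imply a full match from B to A.

```python
from collections import defaultdict


def pair_overlap_counts(matches):
    """Count unique journal-periods per (source, comparison, overlap type)."""
    seen = defaultdict(set)
    for src_coll, cmp_coll, journal_period, overlap_type in matches:
        if src_coll == cmp_coll:
            continue  # a collection is not compared with itself
        # A set removes duplicates, so each journal-period counts once.
        seen[(src_coll, cmp_coll, overlap_type)].add(journal_period)
    return {key: len(periods) for key, periods in seen.items()}


# Hypothetical match records; the first one appears twice on purpose.
jp1 = ("12345678", 1990, 2010)
jp2 = ("11112222", 2000, 2026)
pair_counts = pair_overlap_counts([
    ("JSTOR A", "JSTOR B", jp1, "full"),
    ("JSTOR A", "JSTOR B", jp1, "full"),
    ("JSTOR A", "JSTOR B", jp2, "partial"),
    ("JSTOR B", "JSTOR A", jp1, "partial"),
])
```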

3. JSTOR vs non-JSTOR overlap

For each JSTOR collection:

  • Identifies journals that also appear in non-JSTOR collections
  • Classifies overlap type

4. Detailed drill-down table

Creates a fully expanded match table intended for interactive use in Power BI.

Users can select a specific journal-period in a JSTOR collection and see:

  • All other collections where it appears
  • Periods covered
  • Overlap type and percentage

5. Unique and at-risk journals

Identifies two mutually exclusive categories:

  • No matches: journals not present in any other collection
  • JSTOR matches only: journals that also appear in at least one other JSTOR collection but in no non-JSTOR collection
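The two categories can be told apart from the list of collections in which a journal was matched. The sketch below assumes JSTOR collections are recognisable by a name prefix, which is a guess; the real rule for identifying them from `oclc_collection_name` may differ. Which overlap types count as a match is left to the caller.

```python
def is_jstor(collection_name):
    # Assumption: JSTOR collection names start with "JSTOR".
    return collection_name.upper().startswith("JSTOR")


def at_risk_category(matched_collections):
    """Classify a JSTOR journal by the other collections it was matched in.

    Returns "No matches", "JSTOR matches only", or None when the journal
    is also held in at least one non-JSTOR collection (not at risk).
    """
    if not matched_collections:
        return "No matches"
    if all(is_jstor(name) for name in matched_collections):
        return "JSTOR matches only"
    return None
```

Because the first test is for an empty list and the second requires at least one match, a journal can never fall into both categories.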

6. Enrichment: VU publications and SJR

For at-risk journals, the analysis can be enriched with:

  • Number of VU publications (last 10 years) from PURE as a proxy for local usage
  • SJR indicator as a proxy for journal importance/prestige

Matching is performed using multiple ISSN columns with priority rules.
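A priority-based lookup of this kind might look like the sketch below. The priority order (online ISSN first, then print ISSN), the function name, and the sample figures are assumptions for illustration only; the actual rules are not described here.

```python
def lookup_by_priority(journal, lookup,
                       issn_fields=("online_identifier", "print_identifier")):
    """Return the first value found in `lookup`, trying ISSN fields in order.

    `journal` is a dict of column values; `lookup` maps normalized ISSNs
    (no hyphen, uppercase) to an enrichment value such as a publication
    count or an SJR score.
    """
    for field in issn_fields:
        issn = (journal.get(field) or "").replace("-", "").strip().upper()
        if issn and issn in lookup:
            return lookup[issn]
    return None  # no ISSN of this journal is known to the enrichment source


# Hypothetical enrichment data keyed by ISSN, and a hypothetical journal row.
pub_counts = {"12345678": 14, "87654321": 3}
journal = {"online_identifier": "8765-4321", "print_identifier": "1234-5678"}
n_pubs = lookup_by_priority(journal, pub_counts)
```

Here the online ISSN is tried first and found, so the print ISSN is never consulted.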


Outputs

When run as a script, the analysis produces a single Excel workbook:

journal_overlap_analysis_results.xlsx

with the following sheets:

| Sheet name | Description |
| --- | --- |
| Overall | JSTOR journal-periods with co-occurrence flags and counts |
| JSTOR_InterCollection | Overlap between JSTOR sub-collections |
| JSTOR_vs_NonJSTOR | Overlap between JSTOR and non-JSTOR collections |
| Drilldown | Detailed journal-period match table |
| Unique_with_VU_Pubs | At-risk journals enriched with VU publications and SJR |
| (or) Unique | At-risk journals without enrichment |
