---
title: "How to use this package to support your IPEDS reporting"
description: "This package will ingest your institutional data, aggregate/reshape that data, and produce a text file which is compatible with the IPEDS report upload requirements."
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{How to use this package to support your IPEDS reporting}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

```

A few things to note:

* To ensure you have the most current version of this package, we suggest you check for an updated version and reinstall (as needed) during each collection window.
* To use this package, you'll need to prepare your data as indicated in the vignettes. The vignettes can all be viewed through the [package website](https://alisonlanski.github.io/IPEDSuploadables/).  
* IPEDS upload files must contain 100% of the requested information for a particular report. In other words, if a report has parts A, B, and C, you must upload all parts in a single file or the upload will fail.  *Note: I have heard rumors that you can now successfully upload an incomplete file. I have not confirmed this for myself.*
* To create an upload-ready file based on your own data preparation, see [How to produce other key-value uploads](howto_use_autoformat.html).

## Step 0: Install/Load the package

```{r setup, eval=FALSE}
#official install from CRAN
install.packages("IPEDSuploadables")

#development install from the github repo
#use this if you want to pull in changes before they reach CRAN or need to use other code branches

#development install option 1
remotes::install_github("AlisonLanski/IPEDSuploadables")

#development install option 2
devtools::install_github("AlisonLanski/IPEDSuploadables")
```

```{r load, eval = TRUE}
#load the package
library(IPEDSuploadables)
```
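
To decide whether a reinstall is needed for the current collection window, you can compare your installed version with the one available on CRAN. The sketch below is only an illustration: `needs_update()` is a hypothetical helper (it is not part of this package), and the commented lines assume you have an internet connection and a CRAN mirror configured.

```{r check_version, eval = FALSE}
#hypothetical helper (not part of IPEDSuploadables)
#returns TRUE when the available version is newer than the installed version
needs_update <- function(installed, available) {
  package_version(available) > package_version(installed)
}

#comparison using made-up version strings
needs_update(installed = "2.8.0", available = "2.10.1")

#comparison for your own installation (uncomment to run; needs internet access)
#installed <- as.character(utils::packageVersion("IPEDSuploadables"))
#on_cran <- utils::available.packages()["IPEDSuploadables", "Version"]
#needs_update(installed, on_cran)
```

Versions are compared as version numbers rather than as text, so "2.10.1" is correctly treated as newer than "2.8.0".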

## Step 1: Prep your institutional data

The package requires specific column names and values to work correctly. 
Each IPEDS report has a vignette that spells out requirements and also has sample data you can examine.

**To create files for non-supported reports, see [How to produce other key-value uploads](howto_use_autoformat.html)**

| IPEDS Report | Required Dataframes | Sample Data |
|:--------|:--------|:---------|
| 12 Month Enrollment  | [students & instructional activity](setup_for_12monthenrollment.html)  |  e1d_students, e1d_instr  |
| Completions  | [students & extra cips](setup_for_completions.html)  | com_students, com_cips  |
| Fall Enrollment  | [students & retention aggregate](setup_for_fallenrollment.html)  |  ef1_students, ef1_retention  |
| Graduation Rates  | [students](setup_for_gr.html)  |  gr_students  |
| Graduation Rates 200   | [students](setup_for_gr200.html)  |  gr200_students  |
| Human Resources  | [staff](setup_for_hr.html) | hr_staff  |
| Outcome Measures  |  [students](setup_for_om.html)  |  om_students  |

  
You can call the sample data as desired to explore it:

```{r dummy_data, eval = TRUE}
#dataframe of student information for 12 month enrollment
head(e1d_students)
```
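
Because the package depends on specific column names, one quick check before running a report is to compare the column names of your own dataframe with those of the matching sample data. The sketch below is a rough illustration only: `missing_columns()` is a hypothetical helper (not part of this package), the toy dataframes and their column names are made up, and `my_students` stands in for your own data. It does an exact, case-sensitive comparison and does not check values, so the vignette for each report remains the authority on what is required.

```{r check_columns, eval = FALSE}
#hypothetical helper (not part of IPEDSuploadables)
#returns the column names found in the reference data but not in your data
missing_columns <- function(reference, candidate) {
  setdiff(names(reference), names(candidate))
}

#toy example with made-up columns
toy_reference <- data.frame(ColA = 1, ColB = 2, ColC = 3)
toy_candidate <- data.frame(ColA = 1, ColB = 2)
missing_columns(toy_reference, toy_candidate)

#with the package loaded, compare your own data to the sample data
#(my_students is a placeholder for your dataframe)
#missing_columns(e1d_students, my_students)
```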


## Step 2: Read in your data and produce your IPEDS report

Each IPEDS report has a single function which will produce all required sub-parts in a single file for upload.  

* Some reports require 2 dataframes  
* Use the popup box to choose the folder where the report should be saved.  
* The upload file will be saved as a txt file with a date-stamp.  Rerunning the script on the same day to the same location will overwrite your original file.

Example:
```{r produce_report, eval=FALSE}
#full export using sample data
produce_e1d_report(df = e1d_students, hrs = e1d_instr, part = "ALL")
```


A message will display the location of your file when it has been processed.
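
Since rerunning on the same day to the same location overwrites the earlier file, you may want to keep a copy of a file before producing it again. The following is a minimal sketch using base R only: `backup_file()` is a hypothetical helper (not part of this package), and the temporary file stands in for a real upload file, whose path you would take from the message described above.

```{r backup_upload, eval = FALSE}
#hypothetical helper (not part of IPEDSuploadables)
#copies a file next to itself, adding a time-stamp to the file name
backup_file <- function(path) {
  stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
  backup_path <- paste0(tools::file_path_sans_ext(path),
                        "_backup_", stamp, ".", tools::file_ext(path))
  copied <- file.copy(path, backup_path, overwrite = FALSE)
  if (!copied) stop("Could not create backup: ", backup_path)
  backup_path
}

#demonstration on a temporary stand-in file
demo_file <- tempfile(fileext = ".txt")
writeLines("placeholder contents", demo_file)
demo_backup <- backup_file(demo_file)
file.exists(demo_backup)
```

The helper refuses to overwrite an existing backup, so two backups made within the same second will stop with an error rather than silently replace each other.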

### Note about the gender collection updates for 2024-2025 reporting cycle
  
*Graduation Rates no longer has a gender detail requirement (it was removed by IPEDS). Therefore, the starting data requirements, sample data, and scripts have been updated to remove the `ugender` flag and the `GenderDetail` column.*

Completions, 12 Month Enrollment, and Fall Enrollment still DO require reporting on "another gender" and "unknown gender". This means you continue to need `GenderDetail`, `ugender`, and `ggender` information as appropriate for each survey. Scripts are set to assume that you ARE able to report on "another gender" in this reporting cycle.
They will automatically mask data if you have fewer than 5 students in the "another gender" category.
Therefore, as long as you have the capacity to report Another Gender, you should keep the default of `TRUE` even if you have no students or very few students in that category.

  
If you are **NOT** able to collect/report on students of "another gender" this cycle:

* Set `ugender = FALSE` (for undergraduates) in the produce function
* Set `ggender = FALSE` (for graduates) in the produce function

Example:
```{r gender_example, eval = FALSE}
#able to report undergraduate "another gender" but NOT able to report graduate "another gender"
produce_com_report(df = com_students, extracips = com_cips, ggender = FALSE)
```


### Notes for specific reports

Completions

* The extra cips dataframe is optional and can be left empty if your school has student completions in all CIP/award level combinations previously reported.


Fall Enrollment

* Your student-faculty ratio number is collected via pop-up box.  Type a whole number when prompted.

* The main function auto-detects the IPEDS submission year and adjusts the sections of cip-based enrollment, student age, and student residence state appropriately to be in or out of your report.  If you would like to report data in optional years, change the `include_optional` flag to `TRUE`. 

Human Resources

* The script requires one dataframe with all current employees: continuing and new hires. New hires are designated through a flag and should not have a second entry.  
  
Admissions  
  
* A sample university-specific script is shown in the vignette [How to produce other key-value uploads](howto_use_autoformat.html) as an example of the data preparation process. The sample code includes data masking for fewer than five students of Another Gender and for exam percentiles with fewer than five SAT or ACT scores.


## Step 3: Upload your IPEDS report
  
1. Navigate to the IPEDS submission portal and log in.  

2. Select the appropriate report (for example: "12 Month Enrollment"), select "key-value pair", then browse to your .txt file and upload.  

3. Click through the survey screens to review results and verify submission accuracy.

4. Continue your standard process to check edits, enter metadata, lock, and submit the survey.


## Troubleshooting

#### You need to make a data correction after uploading
Update your base data, rerun the package function, and re-upload to IPEDS.  You can upload as many times as you want to any particular survey.  
Your keyholder may be able to edit directly in the website, but this method is not recommended since it is not reproducible.

If you verify that your data and dataframe(s) are correct, but IPEDS still shows incorrect aggregation, please contact the package authors by adding a [GitHub issue](https://github.com/AlisonLanski/IPEDSuploadables/issues) or by emailing (if you search online for *Alison Lanski* and *Notre Dame*, you can get to a valid email address). We can provide the quickest help if you share an example with sample data, expected output, and actual output.

#### IPEDS rejects your upload file or accepts the file but provides an error message
Common issues include:

* Your data has disallowed values
* Your upload is missing a sub-part of the report
* IPEDS is expecting particular information based on previous years of reporting which is inconsistent or missing

*For example, IPEDS wants all previous CIP/Award Level combinations reported on the current completions upload file. If you have no students in those categories, add any flagged CIP/Award combinations to your extra_cips dataframe, rerun, and re-upload to remove this error message.*

#### How the package can help you troubleshoot

A few functions have built-in data quality checks which will alert you to disallowed values.  We are working to add more automated checks.  
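
For columns that are not yet covered by a built-in check, you can look for disallowed values yourself before producing a report. The sketch below is an illustration only: `find_disallowed()` is a hypothetical helper (not part of this package), and the column name and allowed codes are made up. Take the real allowed values for each column from the setup vignette for your report.

```{r check_values, eval = FALSE}
#hypothetical helper (not part of IPEDSuploadables)
#returns the distinct values in x that are not in the allowed set
#missing values (NA) are reported unless NA is listed as allowed
find_disallowed <- function(x, allowed) {
  unique(x[!(x %in% allowed)])
}

#toy example with a made-up column and made-up allowed codes
toy_data <- data.frame(SomeCode = c(1, 2, 2, 9, NA))
find_disallowed(toy_data$SomeCode, allowed = c(1, 2))
```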

If you want to examine a single report sub-part in depth, you can create a separate text file with only those values.

Example:
```{r produce_subreport, eval = FALSE}
#if you only want to look at 12 month enrollment part B
produce_e1d_report(hrs = e1d_instr, part = "B")
```

You can also produce a csv version of the part's text file for more readable output (still using IPEDS upload-required column names and values).  
  
Example:
```{r produce_prettyfile, eval = FALSE}
#text files make my eyes bleed! let's use a csv
produce_e1d_report(hrs = e1d_instr, part = "B", format = "readable")
```

#### IPEDS has changed a survey's requirements, but you need to rerun an old report
At the end of every IPEDS collection cycle, that year's package code will be saved into a branch on GitHub.  To use an old version, install the package from the branch.

```{r branch, eval=FALSE}
#install, picking an acceptable year range
devtools::install_github(repo = "AlisonLanski/IPEDSuploadables", ref = "reporting_year_2022-2023")
```

When you have finished running your code, reinstall the current version of the package by removing the "ref" argument.