Kavi® Members Help

Appendix A. Preparing CSV Files for Upload Data

Overview

The Upload Data tool accepts files in comma-separated values (CSV) file format (i.e., files with a .csv extension). These data files are used to perform batch adds, edits or delete operations. CSV files can be viewed and manipulated through a spreadsheet application or text editor. This document provides tips on working with CSV files, maintaining file structure and general best practices advice for data preparation.

Back to top

CSV File Structure

CSV templates or data files can be downloaded from links at the top of the Upload Data tool. The first row of the template or data file contains column headings. Each subsequent row corresponds to a record in the database. When a CSV template is downloaded, it only contains the column headings. Since templates are used to add new records, new rows will be added for each record. When a CSV data file is downloaded, the first row contains the column heading and subsequent rows contain data records that already exist in the database. The records in these rows can be edited or deleted.

In a CSV file, each row contains an ordered sequence of column headings or values separated by commas. The commas are used to maintain file structure. Each comma in the first row (which contains the column headings) delimits a column heading and a place in the ordered sequence of columns. Commas in subsequent rows also maintain the sequence of ordered columns, so the first value in every subsequent row represents a value in the first column, the second value in every subsequent row represents a value in the second column and so on. Unlike standard sentence punctuation, there is no space after a comma.

Most values are encased in double quotes. The exception is the single-character value, such as a 1 or a 0 (zero). Enclosing the value in double quotes allows complex values such as those that contain commas to be used within a field without breaking the file structure. For example, a field that contains a series of items, such as favorite colors, might have a value like this:

"red, green and blue"

You won't necessarily be aware of these quotes when viewing the data file through a spreadsheet application, but they do show up when the file is viewed in a text editor.

Back to top

Text Editor View of a CSV Data File

Here is a simplified example of the 'company.csv' as it would look when viewed in a text editor. The first column heading in your data file should always be 'upload_action', which is used to tell the Upload Data tool what kind of action to perform on this row. In this example, a data file was downloaded in order to edit the selected recrods. The first row contains the column headings and the second and subsequent rows contain data records. The first field in every row corresponds to the 'upload_action' column heading. This is prefilled for the 'edit' action when the data file is downloaded. In this example, the second column heading is 'item_key' and the third is the company 'name'. The item key is always required for edit or delete operations and must never be edited.

"upload_action","item_key","name"

edit,"uv1618033989","Kavi Corp."

edit,"uv0618033990","Nonmember Co."

edit,"fs12358132134","Member Company, Inc."

Back to top

Tips for Manipulating CSV Files

Tips

  • The whole system breaks down if there is a missing or extraneous comma in a row. Every value after that missing or extra data field will be entered into the wrong column. In worst case scenario, the database may be corrupted so seriously that it's necessary to revert to a backup version, resulting in a loss of the most recent data changes and a colossal headache for the admin, so it's very important to maintain file structure.

  • The columns in your CSV file can appear in any order as long as the sequence is maintained. In other words, the order in which column headings appear in the first row must be repeated in the subsequent data rows, so that the data in each field can be matched up with the correct column.

  • You may omit any columns where you don't want to add or edit data unless that column is required by the Upload Data tool, database or site configuration rules. It's actually a good practice to omit unnecessary columns in order to simplify data file structure and reduce the likelihood of introducing errors in the unnecessary column. You cannot omit fields required by the Upload Data tool, but you may omit fields required by the database providing the default is appropriate for all the records you are adding or editing. If the default value isn't appropriate for any of the records in your data file, you should include that column and specify appropriate values for those records.

  • Purpose fields in the database store different versions of the values that are presented through webpages. For example, the User Purpose 'Company Representative' is stored in the database as 'company_rep'. The values are mapped to each other and converted as data is uploaded and downloaded from the database. You may find it easier to use the database value, which you can see if you download a CSV data file through the links at the top of the Upload Data file.

Back to top

Editing CSV Files in a Spreadsheet or Text Editor

CSV format files are wonderful because they're compatible with every sort of text editor and spreadsheet application, but spreadsheets tend to want to corrupt the data, so you have to be on guard against this. Text editors are generally much better behaved, so those who are comfortable maintaining CSV file structure in text editors can use whichever text editor they prefer.

But many people prefer to use a spreadsheet program, and the data is vulnerable on import (when leading zeros can be erased or long numbers converted to standard notation and rounded off) and on export (when extraneous characters can be added). You might not even see that your data had changed until you export the file to CSV and open it in a text editor. Characters that were hidden when the file was viewed through the spreadsheet application are suddenly visible when the file is viewed through a text editor. If you want to use a spreadsheet, you must prepare the spreadsheet in advance to handle numerical values, and clean up any bad characters in the output.

Preparing a Spreadsheet for Downloaded Data

If there is any chance that your data file has values that include numerals that should not be treated like normal numerical values (e.g., postal codes, IDs, etc.), you should open the file in a text editor first to determine which columns contain these kinds of values. Open a blank spreadsheet separately, select any columns where this kind of data will appear and set the data type to 'Text'. In most cases you can set every column in the spreadsheet to 'Text'. Oddly enough, the series of actions that allow you to do this is usually some varient of selecting 'Format Cell', then the 'Number' tab, then selecting 'Text'.

Testing a Spreadsheet Application

Before you upload an edited data file, test your favorite spreadsheet application to see if the file exports cleanly or not.

How to test your spreadsheet application:

  1. Open a CSV file with your spreadsheet application, edit the data and save the file in .csv format. You might want to practice on a dummy file that you have no intention of uploading or do simple data changes on a single test company or test user, or your personal account, if you have one.

  2. Open your edited .csv file in a text editor and see if the field values are demarcated by double quotes (except for single/character values) — this is a Good Thing — and there are no extraneous characters — which is a Bad Thing. The most common hidden character inserted by spreadsheet applications is a newline character, so look for '/n' or '^n' at the end of a data row. If you do see double quotes and don't see newline characters (or other oddities), proceed to the next test.

  3. If you used actual data and want to upload the data now, use the Upload Data tool's Test Run feature to perform a test upload on your file so the edits you made in your practice file aren't immediately committed to the database. If everything works as expected and the Upload Data report doesn't mention that any extraneous characters were encountered, you should be able to export cleanly to .csv from your spreadsheet application.

Cleaning Up After a Spreadsheet Application

If you discover that your naughty spreadsheet application has inserted characters into your .csv data file, you may still use this spreadsheet application, providing you clean up after it by opening your .csv data file in a text editor and removing any unwanted characters before proceeding to the data upload.

Ideally, the data in every field is enclosed in double straight quotes. If double curly quotes are present in the file after export, do a find and replace operation to replace the curly quotes with double straight quotes. Single curly quotes should be replaced with single straight quotes.

Back to top

Data File Preparation

  1. Simplify your data file or template by removing any columns that aren't required by the Upload Tool, aren't required by the database and otherwise aren't useful under the circumstances. This includes all membership-related columns (except 'member_types' which actually stores User Types), since membership data cannot be edited through the Upload Data tool.

  2. Add the 'upload_action' column and specify which action you want performed on that record (i.e., 'add', 'edit', 'delete').

  3. If you downloaded a data file and are performing an edit or delete operation, remove any rows (records) that you want to remain unchanged (i.e., rows you don't want to edit or delete).

  4. Perform Basic data checking as described in the following section.

  5. When you have completed these steps, finish your data preparation by following the instructions in How to Add Records through Upload Data or How to Edit or Delete Records through Upload Data.

Back to top

Basic Data Checking

It's important to do some basic data checking before, during or after importing data into the CSV file— whatever is most convenient for you.

Check for these and similar common errors:

  • To eliminate the most common cause of errors, be careful to preserve the proper structure in the comma/separated file.

  • Be sure the 'upload_action' column is present and correct values have been assigned as required by the Upload Data tool. It is included as the first column in the templates and data files, but if you have acquired the data through reporting tools or other sources, you need to add an 'upload_action' column.

    The 'upload_action' column must have a value specifying the action that you want have performed on that row: 'add', 'edit' or 'delete'. Records where no data is to be changed can be designated by 'none', but it is better to omit these rows entirely. Upload action values must be in all lowercase letters.

  • When adding records, always check for and remove any duplicates. The Members database won't allow the same exact name to be entered twice (unless the purposes are different), but duplicates can occur when a company or user who already exists in the database is entered again under a slightly different name (e.g., 'Example Company' and 'Example Co.', 'John Q Public' and 'Jon Quincy Public').

  • Check the format of all fields containing dates. The correct date format is YYYY/MM/DD for all date-related fields. The 'membership_start_dates' and 'membership_end_dates' are the exception to this rule. These fields may contain dates for multiple memberships as a comma-delimited string. Dates in other formats will not be rejected outright during validation, but the Upload Data tool will attempt to force the data into the YYYY/MM/DD format, which would result in data corruption in the improperly formatted date fields.

Back to top