File naming and formats best practices

Submitters should prioritize durable file formats so that their files remain readable over time. Careful naming and versioning also matter, because they determine how easily files can be identified and reused.

Prioritize file formats that:

  • Are commonly used and high quality;
  • Can be read by multiple software applications;
  • Are not proprietary (free and open-source software);
  • Are open and well-documented;
  • Are uncompressed and unencrypted.

Recommended formats by file type. Within each file type, formats are listed in decreasing order of confidence: high, then medium, then low level of confidence.

Text document:
– PDF/A-1 – ISO 19005-1 (.pdf)
– Plain Text – with encoding: US-ASCII, UTF-8 (.txt)
– XML – with included schema (.xml)
– HTML – include a DOCTYPE declaration (.htm, .html)
– LaTeX with referenced files (.latex, .tex, .ltx)
– Microsoft Word 2007 or newer (.docx)
– PDF – with embedded fonts (.pdf)
– Rich Text Format 1.x (.rtf)
– SGML (.sgml)
– Microsoft Word 2003 or older (.doc)
– PDF – encrypted (.pdf)
– WordPerfect (.wpd)

Spreadsheet:
– Comma- or tab-separated values (.csv, .tsv, .txt)
– Delimited text (.txt, .csv)
– SIARD: Software Independent Archiving of Relational Databases (.siard)
– Excel 2007 or newer (.xlsx)
– OpenDocument Spreadsheet (.ods)
– XML (.xml)
– Excel 2003 or older (.xls)

Presentation:
– PDF/A-1 – ISO 19005-1 (.pdf)
– Microsoft PowerPoint 2007+ (OOXML) (.ppt, .pptx)
– OpenDocument Presentation (.odp)
– Portable Document Format (.pdf)
– PowerPoint (.ppt)
– Keynote (.key)

Image:
– PNG – 24-bit (.png)
– TIFF – uncompressed (.tif, .tiff)
– Digital Negative DNG (.dng)
– GIF (.gif)
– JPEG2000 – lossless (.jp2)
– JPEG/JFIF (.jpg)
– PNG – 8-bit (.png)
– TIFF – compressed (.tif, .tiff)
– JPEG2000 – lossy (.jp2)
– Photoshop document (.psd)
– RAW formats (.raw, etc.)

Audio:
– AIFF – uncompressed (.aif, .aiff)
– Free Lossless Audio Codec (.flac)
– WAV – uncompressed (.wav)
– Advanced Audio Coding (.mp4)
– Apple Lossless Audio Codec (ALAC) (.m4a)
– MP3 (.mp3)
– SUN audio – uncompressed (.au, .snd)
– AIFC – compressed AIFF (.aifc)
– RealAudio (.ra, .rm)
– WAV – compressed (.wav)
– Windows Media Audio (.wma)

Video:
– AVI – uncompressed (.avi)
– QuickTime – uncompressed, motion JPEG (.mov)
– Material Exchange Format – uncompressed (.mxf)
– Motion JPEG2000 (.jp2)
– MPEG-1, MPEG-2 (.mp1, .mp2)
– MPEG-4 – preferably H.264 (.mp4)
– RealVideo (.rv, .rm)
– QuickTime – compressed (.mov)
– Windows Media Video (.wmv)

CAD (Computer Aided Design):
– PDF/E – ISO 24517-1:2008 (.pdf)
– AutoCAD (.dwg)
– Autodesk’s Drawing Interchange File Format/Data eXchange Format (.dxf)
– Low level of confidence: N/A

Archives:
– ZIP (.zip)
– TAR (.tar)
– GZIP (.gz)
– RAR (.rar)
– Medium and low levels of confidence: N/A

Data:
– JSON (.json)
– XML (.xml)
– SQL (.sql)
– Medium and low levels of confidence: N/A

Source code:
– Code files (.py, .java, .cpp, .html, .css, .js, etc.)
– Medium and low levels of confidence: N/A

Please follow these rules for naming the files you upload to Infoscience:

  • Choose a meaningful name for your file.
  • Avoid overly long names (max. 32 characters, including the extension).
  • Do not use accented characters or special characters (such as space, #, @, &, €, +, etc.).
  • Use an underscore “_” between terms if needed.
  • Avoid conjunctions and prepositions (and, on, of, about, etc.) and unnecessary articles (the, a, an, etc.).
  • Avoid abbreviations unless they are well-known at EPFL.
  • Do not include the author’s name in the file name. This information is already present in the metadata of the record.
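
The mechanical parts of these rules (length limit, allowed characters, filler words) can be checked automatically before upload. The following is a minimal sketch, not an Infoscience tool: the function name `check_file_name` is hypothetical, and the set of allowed characters (ASCII letters, digits, underscore, hyphen, and dot) is an assumption, since the rules above only list examples of characters to avoid. Whether a name is meaningful, or an abbreviation well known, still needs human judgment.

```python
import re

MAX_LENGTH = 32  # from the rules above: 32 characters, extension included
# Assumption: only ASCII letters, digits, "_", "-" and "." are considered safe.
SAFE_CHARS = r"A-Za-z0-9_.\-"
# Filler words named as examples in the rules above.
FILLER_WORDS = {"and", "on", "of", "about", "the", "a", "an"}


def check_file_name(name: str) -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    if not name:
        return ["empty name"]
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append(f"too long: {len(name)} characters (max. {MAX_LENGTH})")
    # Collect every character outside the assumed safe set.
    bad = sorted(set(re.sub(f"[{SAFE_CHARS}]", "", name)))
    if bad:
        problems.append("disallowed characters: " + " ".join(repr(c) for c in bad))
    # Split the name (without extension) into terms and look for filler words.
    stem = name.rsplit(".", 1)[0]
    fillers = [t for t in re.split(r"[\W_]+", stem.lower()) if t in FILLER_WORDS]
    if fillers:
        problems.append("filler words: " + ", ".join(fillers))
    return problems


print(check_file_name("thermal_model_results_v2.csv"))  # []
print(check_file_name("Étude of the résultats #3.xlsx"))
```

The function reports problems rather than renaming files, so the submitter stays in control of the final name.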

Examples:

Infoscience allows you to upload multiple editorial versions of the same document to a single record using the “create a new version” option.

The version indicates the status of the document based on its progress in the publication cycle.

Publication versions:

  • Preprint or submitted version: The version submitted before peer review, not yet accepted.
  • Postprint or accepted version: The version accepted after peer review but before publisher formatting.
  • Published version: The final version accepted by a journal, peer-reviewed, corrected by the author, and formatted by the journal’s editor for publication.

If the publisher’s policy allows it, upload the published version and apply any embargo the publisher imposes. It is always permissible to upload preprints and postprints.