Mythbusting: Data in Your Repository

It’s no secret that data is one of the hottest topics in scholarly communications right now, and at bepress we’re excited to be a part of the conversation. As part of our ongoing data initiative, we decided to take a look at some of the most common myths and misperceptions about data in the repository and offer our counterarguments to help ease some of the fear and continue moving the conversation forward.

MYTH: All data files are big.

BUSTED: Data files come in a variety of sizes, and the vast majority are under 1 GB.

Since the launch of Digital Commons in 2002, we at bepress have steadily increased our file-handling capabilities to stay ahead of the needs of the community. With the most recent improvements to our system, we can now accept files up to 2 GB in size, making Digital Commons an ideal solution for the majority of publishable datasets generated by scholars. We’ve also doubled our bandwidth, enabling us to handle these bigger files with greater speed.

MYTH: All data files are numeric.

BUSTED: Data can have a variety of formats.

In Digital Commons repositories alone, we’ve seen a huge variety of data formats, from 3D printable objects and seismic recordings to computer code and audio files. The days of a single definition of data are over.

MYTH: Only researchers in the hard sciences work with data.

BUSTED: You’ll find data-driven research all over campus.

Most people immediately associate the word “data” with the sciences, but just as there is a wide variety of data formats, there is also a wide variety of fields that work with data. When looking for data on your campus, don’t discount the arts and humanities! The wider you cast your net, the more likely you are to find rich datasets in unexpected places.

[Chart: File sizes from a random sample of 300 datasets in Figshare, Dryad, and Dataverse.]


For more information about how to start a data program on your campus, check out our data resource page or contact Consulting Services. We’ll also be rebroadcasting our popular webinar “Getting Started with Research Data in Your Repository” this fall, so keep an eye out for that announcement!