As the question says, the format used depends mainly on the project or domain the data comes from. Most of the time, though, the data is useless without its metadata, and once a dataset starts to get big you need a way to extract that metadata automatically.
In astronomy, where most of the data has been open for decades, the International Virtual Observatory Alliance (IVOA) created the specification for a format that is something of a mix of HTML tables and XML. It is called VOTable, and it records where the data comes from, the names of the columns, the units and other descriptors (based on a set of standards).
Besides being compatible with many of the tools used in astronomy, this file format can also be read and written in Python using the astropy package. A simple VOTable can be read with just:
from astropy.io.votable import parse
# Parse the file into a VOTable object; "votable.xml" is a placeholder path.
votable = parse("votable.xml")
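To make the "extract the metadata automatically" point concrete, here is a minimal sketch that uses only the Python standard library instead of astropy, so the XML structure stays visible. The sample document, its column names and the helper `field_metadata` are made up for illustration; only the `FIELD` element and its `name`, `datatype`, `unit` and `ucd` attributes come from the VOTable format itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written VOTable fragment (not taken from a real catalogue).
SAMPLE = """<?xml version="1.0"?>
<VOTABLE version="1.3" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE>
    <TABLE name="example">
      <FIELD name="ra" datatype="double" unit="deg" ucd="pos.eq.ra"/>
      <FIELD name="dec" datatype="double" unit="deg" ucd="pos.eq.dec"/>
      <FIELD name="vmag" datatype="float" unit="mag" ucd="phot.mag;em.opt.V"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.68</TD><TD>41.27</TD><TD>3.4</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def field_metadata(xml_text):
    """Return one dict per FIELD element with its main descriptors.

    Illustrative helper (an assumption, not part of any library).
    """
    root = ET.fromstring(xml_text)
    fields = []
    for elem in root.iter():
        # Drop the "{namespace}" prefix so any VOTable version matches.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "FIELD":
            fields.append(
                {key: elem.get(key) for key in ("name", "datatype", "unit", "ucd")}
            )
    return fields


# Print the column descriptors found in the sample document.
for field in field_metadata(SAMPLE):
    print(field["name"], field["datatype"], field["unit"], field["ucd"])
```

In practice astropy does this parsing for you; the sketch only shows that the column names, units and descriptors travel inside the file alongside the data.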
Great post. May I have your permission to include it on the initial post?
Oh, sure, you mention VOTable, but no love for FITS? PyFITS merged into AstroPy, so you can use astropy.io.fits. I believe that SunPy (a Python library for solar physics) uses it, too.
@joe, you are right! FITS files are an option too, and they are indeed accessible from SunPy as well.