DateTime (de)Serialization Benchmarks from Python, Numpy, Chrono, and Time

Datetime parsing and rendering sometimes require optimization when iterating over a large dataset. Say you have a couple million rows of timestamps you'd like to parse into a datatype; it can take a surprising amount of time if you use the wrong import or crate. In this article, I'll benchmark what it takes to load a couple million datetime stamps with Python's datetime, Numpy, Chrono, and Time.

We'll explore different architectural considerations and design patterns to improve the ergonomics of the different software libraries, giving you code you can drop into your project with minimal effort.

There are some considerations to apply when selecting the right datetime-handling library:

  • Leap Seconds
  • Leap Years
  • US Daylight Savings Timezone Offsets
  • Nanosecond Support

Where to apply these considerations is an architectural decision. For example, storing all datetime strings in a database in UTC takes care of leap seconds and leap years and avoids having to manage US daylight-saving offsets. When rendering a datetime string on a webpage using technologies such as VueJS or ReactJS, we can leverage TypeScript/JavaScript to transform those UTC strings into timezone-aware, client-facing objects. Browsers such as Firefox, Chrome, Edge, and Opera know where the user is situated based on their system clocks, so we can pass a UTC datetime string into new Date() and it should render in the correct timezone with the correct offset. Furthermore, we can load those same UTC datetime strings into Python or Rust and format the objects accordingly in the event we need to render a time series.
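On the server side, the same store-in-UTC, convert-at-the-edge pattern can be sketched in Python with the standard library's zoneinfo; the America/New_York zone here is just an illustrative choice, not something from the benchmarks:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Store the timestamp in UTC...
stored = datetime(2024, 10, 2, 6, 12, 10, tzinfo=timezone.utc)

# ...and convert only at the rendering edge, for a specific client zone
local = stored.astimezone(ZoneInfo('America/New_York'))

# In early October the US East Coast is on daylight saving time (UTC-4)
assert local.utcoffset().total_seconds() == -4 * 3600
assert local.hour == 2  # 06:12 UTC is 02:12 EDT
```

This keeps daylight-saving logic out of the storage layer entirely, the same way new Date() does in the browser.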

Python's datetime Module

I find that when working with the datetime module, it's fast enough to use in a JSON WebAPI, but too slow when loading, rendering, or generating large amounts of data. Therefore, it's well suited to enterprise projects built on frameworks like Django, FastAPI, or Flask.

import typing

from datetime import datetime, timedelta, timezone

T = typing.TypeVar('T')

class DateTimeUTC:
  FORMAT = '%Y-%m-%dT%H:%M:%S%z'
  def __init__(self, current_datetime: str | datetime | None = None) -> None:
    if isinstance(current_datetime, datetime):
      self._datetime = current_datetime
    elif isinstance(current_datetime, str):
      self._datetime = datetime.strptime(current_datetime, self.FORMAT)
    elif current_datetime is None:
      self._datetime = datetime.now(timezone.utc)
    else:
      raise NotImplementedError(current_datetime.__class__)

  def __str__(self) -> str:
    return self._datetime.strftime(self.FORMAT)
  
  def __repr__(self) -> str:
    return f'DateTimeUTC: {self}'
  
  def __eq__(self, other: T) -> bool:
    return self._datetime.__eq__(other._datetime)
  
  def __ne__(self, other: T) -> bool:
    return self._datetime.__ne__(other._datetime)
  
  def __lt__(self, other: T) -> bool:
    return self._datetime.__lt__(other._datetime)
  
  def __le__(self, other: T) -> bool:
    return self._datetime.__le__(other._datetime)
  
  def __gt__(self, other: T) -> bool:
    return self._datetime.__gt__(other._datetime)
  
  def __ge__(self, other: T) -> bool:
    return self._datetime.__ge__(other._datetime)
  
  def __add__(self, delta: timedelta) -> 'DateTimeUTC':
    # Adding two datetimes is undefined in Python, so only timedelta offsets are accepted
    if isinstance(delta, timedelta):
      return DateTimeUTC(self._datetime + delta)
    else:
      raise NotImplementedError(delta.__class__)
  
  def __sub__(self, delta: timedelta | T) -> timedelta | T:
    if isinstance(delta, timedelta):
      return DateTimeUTC(self._datetime - delta)
    elif isinstance(delta, self.__class__):
      return self._datetime - delta._datetime
    else:
      raise NotImplementedError(delta.__class__)

In the implementation above, a series of dunder methods are implemented on the DateTimeUTC object. Those methods allow us to compare objects and to add or subtract time from an existing object using timedelta. At the core is the formatting, implemented in a way similar to how the new Date() object behaves in common browsers.

>>> future_timestamp = str(DateTimeUTC() + timedelta(hours=1))
>>> future_timestamp
'2024-10-02T06:12:10+0000'

The format %Y-%m-%dT%H:%M:%S%z provides a datetime string that new Date() can interpret without having to manage timezones in TypeScript.

new Date('2024-10-02T06:12:10+0000')
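The round trip in Python looks like this; a minimal sketch using the same FORMAT string, independent of the DateTimeUTC class:

```python
from datetime import datetime, timezone

FORMAT = '%Y-%m-%dT%H:%M:%S%z'

now = datetime.now(timezone.utc)
rendered = now.strftime(FORMAT)          # e.g. '2024-10-02T06:12:10+0000'
parsed = datetime.strptime(rendered, FORMAT)

# %S drops sub-second precision, so compare at second resolution
assert parsed == now.replace(microsecond=0)
assert parsed.tzinfo is not None         # %z preserved the UTC offset
```

Because %z survives the round trip, the string can be handed to new Date() or back to strptime without any timezone guesswork.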

Okay, great. Now that we know how to use the datetime module, let's benchmark its performance.

timestamps = [DateTimeUTC(datetime(year=1800, month=1, day=1, tzinfo=timezone.utc)) + timedelta(seconds=idx) for idx in range(0, 2000000)]
results = []
for idx in range(0, 10):
  start = DateTimeUTC()
  timestamps_rendered = []
  for timestamp in timestamps:
    timestamps_rendered.append(str(timestamp))

  stop = DateTimeUTC()
  duration = stop - start
  results.append(duration)

# Rendering timestamp results
[r.seconds for r in results]
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4]

Adding the timezone to the datetime object increased the duration of each benchmark by almost a whole second.
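A rough way to sanity-check that claim on a small sample is timeit; this is my own sketch, not part of the original benchmark, and the absolute numbers depend on hardware, so only the relative gap between naive and aware formatting is meaningful:

```python
import timeit
from datetime import datetime, timezone

FORMAT_NAIVE = '%Y-%m-%dT%H:%M:%S'
FORMAT_AWARE = '%Y-%m-%dT%H:%M:%S%z'

naive = datetime(2024, 10, 2, 6, 12, 10)
aware = naive.replace(tzinfo=timezone.utc)

# Time 100k renders of each variant; %z forces extra offset formatting work
t_naive = timeit.timeit(lambda: naive.strftime(FORMAT_NAIVE), number=100_000)
t_aware = timeit.timeit(lambda: aware.strftime(FORMAT_AWARE), number=100_000)

print(f'naive: {t_naive:.3f}s  aware: {t_aware:.3f}s')
```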

results = []
for idx in range(0, 10):
  print("Iteration: ", idx)
  start = DateTimeUTC()
  for timestamp in timestamps_rendered:
    _ = DateTimeUTC(timestamp)

  stop = DateTimeUTC()
  duration = stop - start
  results.append(duration)

# Loading timestamp results
[r.seconds for r in results]
[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]

As you can see, loading the timestamp is immensely slower. We can extrapolate meaning from these results: if our JSON WebAPIs are receiving more than, say, 500,000 requests per second, then this is probably an area where we could improve API performance. It's simply a lot of CPU time spent rendering data. We could also consider assuming every datetime string is UTC and drop handling the timezone entirely. I personally wouldn't do that, mostly because I think it is more pythonic to keep the timezone intact. Explicit is better than implicit.
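If parsing throughput matters, one option worth knowing about is datetime.fromisoformat, which is typically much faster than strptime for ISO 8601 strings. A caveat I should flag: prior to Python 3.11 it only accepts offsets written with a colon, such as +00:00, rather than the +0000 form used above.

```python
from datetime import datetime, timezone

stamp = '2024-10-02T06:12:10+00:00'

# Purpose-built ISO 8601 parser; avoids strptime's format interpreter
parsed = datetime.fromisoformat(stamp)

assert parsed == datetime(2024, 10, 2, 6, 12, 10, tzinfo=timezone.utc)
assert parsed.utcoffset().total_seconds() == 0
```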

Numpy's np.datetime64 Type

numpy provides datetime64, which is timezone unaware, meaning it won't handle UTC, EST, IST, JST, or GMT. Therefore we'll need to omit the timezone data. This likely improves the performance of the type and is much more ideal for loading data into a database, rendering to flat files, or producing parquet files. Like we did with the DateTimeUTC class, we'll make a Timestamp class to encapsulate the logic for datetime alterations.
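Before wrapping it in a class, the raw np.datetime64 behavior is worth seeing on its own. A minimal sketch (the values and units here are illustrative, not taken from the benchmark):

```python
import numpy as np

# datetime64 carries no timezone; the string is taken at face value
ts = np.datetime64('2024-10-02T06:12:10', 'ns')
assert str(ts) == '2024-10-02T06:12:10.000000000'

# Truncating to coarser units is just a cast
assert str(ts.astype('datetime64[s]')) == '2024-10-02T06:12:10'

# Arithmetic uses timedelta64 in whatever unit you specify
assert str(ts + np.timedelta64(1, 'h')) == '2024-10-02T07:12:10.000000000'
```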

import enum
import operator
import typing

import numpy as np

TS_FORMAT = '%Y-%m-%dT%H:%M:%S%z'
NP_TS_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'
T = typing.TypeVar('T')

class DatetimeEncodeError(Exception):
  pass
class TimestampPart(enum.Enum):
  Year = 'Y'
  Month = 'm'
  Week = 'W'
  Day = 'd'
  Hour = 'H'
  Minute = 'M'
  Second = 'S'
  Nanosecond = 'f'

class Timestamp(typing.NamedTuple):
  value: np.datetime64
  def __hash__(self) -> int:
    return self.value.__hash__()
  
  def __add__(self: T, delta: np.timedelta64) -> T:
    if not isinstance(delta, np.timedelta64):
      raise TypeError(f'Expected np.timedelta64, got {type(delta)}')
    
    return self.__class__(operator.add(self.value, delta))

  def __sub__(self: T, delta: np.timedelta64) -> T:
    if not isinstance(delta, np.timedelta64):
      raise TypeError(f'Expected np.timedelta64, got {type(delta)}')
    
    return self.__class__(operator.sub(self.value, delta))
  
  def __gt__(self, timestamp: T) -> bool:
    if not isinstance(timestamp, self.__class__):
      raise TypeError(f'Expected Timestamp, got {type(timestamp)}')
    
    return operator.gt(self.value, timestamp.value)
  
  def __ge__(self, timestamp: T) -> bool:
    if not isinstance(timestamp, self.__class__):
      raise TypeError(f'Expected Timestamp, got {type(timestamp)}')
    
    return operator.ge(self.value, timestamp.value)
  
  def __eq__(self, timestamp: T) -> bool:
    if not isinstance(timestamp, self.__class__):
      raise TypeError(f'Expected Timestamp, got {type(timestamp)}')
    
    return operator.eq(self.value, timestamp.value)
  
  def __le__(self, timestamp: T) -> bool:
    if not isinstance(timestamp, self.__class__):
      raise TypeError(f'Expected Timestamp, got {type(timestamp)}')
    
    return operator.le(self.value, timestamp.value)
  
  def __lt__(self, timestamp: T) -> bool:
    if not isinstance(timestamp, self.__class__):
      raise TypeError(f'Expected Timestamp, got {type(timestamp)}')
    
    return operator.lt(self.value, timestamp.value)
  
  def format(self: T, format: str = TS_FORMAT) -> str:
    # The format argument is currently unused; datetime64 renders ISO 8601 by default
    return str(self.value)

  @classmethod
  def Parse(cls: T, value: str) -> T:
    return cls(np.datetime64(value))
  
  def replace(self: T, replace_part: TimestampPart, replace_value: int) -> T:
    result = []
    skip_one = False
    for idx, char in enumerate(NP_TS_FORMAT):
      if skip_one is True:
        skip_one = False
        continue

      if char == '%':
        code = NP_TS_FORMAT[idx + 1]
        try:
          part_type = [part for part in TimestampPart if part.value == code][0]
        except IndexError:
          raise DatetimeEncodeError(f'Unknown datetime part: {char}{code}')
        else:
          if part_type is TimestampPart.Year:
            value = str(self.value.astype(f'datetime64[Y]'))

          elif part_type is TimestampPart.Month:
            value = str(self.value.astype(f'datetime64[M]')).rsplit('-', 1)[1]

          elif part_type is TimestampPart.Day:
            value = str(self.value.astype('datetime64[D]')).rsplit('-', 1)[1]

          elif part_type is TimestampPart.Hour:
            value = str(self.value.astype(f'datetime64[h]')).rsplit('T', 1)[1]

          elif part_type is TimestampPart.Minute:
            value = str(self.value.astype(f'datetime64[m]')).rsplit(':', 1)[1]

          elif part_type is TimestampPart.Second:
            value = str(self.value.astype(f'datetime64[s]')).rsplit(':', 1)[1]

          elif part_type is TimestampPart.Nanosecond:
            value = str(self.value.astype(f'datetime64[ns]')).rsplit('.', 1)[1]

          else:
            raise NotImplementedError(part_type)

          
          if replace_part is part_type:
            if part_type is TimestampPart.Nanosecond:
              zero_diff = 9 - len(str(replace_value))
              result.append(f'{"0" * zero_diff}{replace_value}')

            elif replace_value < 10:
              result.append(f'0{replace_value}')

            else:
              result.append(str(replace_value))
          else:
            result.append(value)

          skip_one = True
      else:
        result.append(char)

    return self.__class__(np.datetime64(''.join(result)))

As you can see, there is a lot going on. Most of the logic is used for datetime alterations, not serialization. We'll run the rendering benchmark with the same series of timestamps, but at nanosecond precision and without a timezone.

timestamps = [Timestamp(np.datetime64(datetime(year=1800, month=1, day=1) + timedelta(seconds=idx), 'ns')) for idx in range(0, 2000000)]

results = []
for idx in range(0, 10):
  start = datetime.now(timezone.utc)
  timestamps_rendered = []
  for timestamp in timestamps:
    timestamps_rendered.append(timestamp.format())
  stop = datetime.now(timezone.utc)
  duration = stop - start
  results.append(duration)

>>> [r.seconds for r in results]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

The results show sub-second serialization; let's find the mean.

>>> sum([r.microseconds for r in results]) / 10
371548.3

371548.3 microseconds on average, which comes to about 371.5 milliseconds per benchmark. Considerably faster than the datetime module, which is to be expected: numpy is accelerated with C code, and the Python API exists to make data processing much more manageable. In fact, it has been suggested to me in the past that the for loop might be what's slowing down this code, and not the serialization routine numpy performs when converting the datetime64 to a string.
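That suggestion is easy to probe: numpy can render an entire array in one vectorized call with np.datetime_as_string, removing the Python-level loop entirely. A small sketch (the array size and unit here are illustrative, not the benchmark data):

```python
import numpy as np

# 24 hourly timestamps built by integer offsets in the scalar's own unit
arr = np.datetime64('2024-01-01T00', 'h') + np.arange(24)

# One C-level call renders the whole array; no Python loop, no per-item str()
strings = np.datetime_as_string(arr)

assert len(strings) == 24
assert strings[0] == '2024-01-01T00'
assert strings[-1] == '2024-01-01T23'
```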

Let's go ahead and run a benchmark for loading the datetime string.

results = []
for idx in range(0, 10):
  start = datetime.now(timezone.utc)
  for timestamp in timestamps_rendered:
    _ = Timestamp.Parse(timestamp)
  stop = datetime.now(timezone.utc)
  duration = stop - start
  results.append(duration)

>>> [r.seconds for r in results]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Loading the datetime string shows similar sub-second results. Let's again find the mean.

>>> sum([r.microseconds for r in results]) / 10
774868.9

Definitely a performance hit here when loading the datetime string into np.datetime64, roughly double the serialization time. However, it is still a significant performance increase compared to Python's datetime module.
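The loading side can also be vectorized: casting a string array with .astype parses every element in C rather than calling Timestamp.Parse once per element. A small illustrative sketch:

```python
import numpy as np

strings = np.array(['2024-01-01T00:00:00.000000000',
                    '2024-01-01T00:00:01.000000000'])

# One vectorized parse for the whole array
parsed = strings.astype('datetime64[ns]')

assert parsed.dtype == np.dtype('datetime64[ns]')
# Unit-aware comparison: one second, regardless of stored resolution
assert parsed[1] - parsed[0] == np.timedelta64(1, 's')
```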

Moving into Rust, one of the expectations is that our code will run faster. We also have to consider that not all Python is the same: some Python packages ship with C code to improve performance. For example, take a look at httptools, asyncpg, and uvloop, all developed by MagicStack. Yeah, I'm kind of a fan of the software. With that said, Rust can still be written in a way that runs slower than Python.

Chrono

Chrono was the first crate I used to perform datetime alterations in Rust, so I have an affinity for chrono more than some of the other choices. When I design software in Rust, it's primarily for machine-to-machine comms and less for server/client-browser comms. With that said, I have rolled a couple of web stacks, and the built-in serialization implementing serde is used extensively throughout my code.

#[derive(Debug, Clone, serde::Deserialize, serde::Serialize)]
struct DatetimeString(chrono::DateTime<chrono::Utc>);

pub static TS_FORMAT: &str = "%Y-%m-%dT%H:%M:%S%.9f%z";

fn only_chrono() {
  let mut timestamps = Vec::new();
  for _ in 0..2_000_000 {
    timestamps.push(chrono::Utc::now());
  }
  let mut results = Vec::new();
  let mut rendered_timestamps = Vec::new();
  for _ in 0..10 {
    rendered_timestamps.clear();
    println!("Iteration {:?}", chrono::Utc::now());
    let start = chrono::Utc::now();
    for timestamp in timestamps.clone() {
      let result = timestamp.format(TS_FORMAT).to_string();
      rendered_timestamps.push(result);
    }
    let stop = chrono::Utc::now();
    let duration = stop - start;
    results.push(duration);
  }
  
  println!("Serialization Duration: {:?}", results);

  let mut results = Vec::new();
  for _ in 0..10 {
    println!("Iteration {:?}", chrono::Utc::now());
    let start = chrono::Utc::now();
    for timestamp in rendered_timestamps.clone() {
      let _ = chrono::DateTime::parse_from_str(timestamp.as_str(), TS_FORMAT).expect("Failed");
    }
    let stop = chrono::Utc::now();
    let duration = stop - start;
    results.push(duration);
  }

  println!("Deserialization Duration: {:?}", results);

}
fn with_serde_and_chrono() {
  let mut timestamps = Vec::new();
  for _ in 0..2_000_000 {
    timestamps.push(DatetimeString(chrono::Utc::now()));
  }

  let mut results = Vec::new();
  let mut rendered_timestamps = Vec::new();
  for _ in 0..10 {
    rendered_timestamps.clear();
    println!("Iteration {:?}", chrono::Utc::now());
    let start = chrono::Utc::now();
    for timestamp in timestamps.clone() {
      let result = serde_json::to_string(&timestamp).expect("Failed");
      rendered_timestamps.push(result);
    }
    let stop = chrono::Utc::now();
    let duration = stop - start;
    results.push(duration);
  }
  
  println!("Serialization Duration: {:?}", results);

  let mut results = Vec::new();
  for _ in 0..10 {
    println!("Iteration {:?}", chrono::Utc::now());
    let start = chrono::Utc::now();
    for timestamp in rendered_timestamps.clone() {
      let _: DatetimeString = serde_json::from_str(&timestamp.as_str()).expect("Failed");
    }
    let stop = chrono::Utc::now();
    let duration = stop - start;
    results.push(duration);
  }

  println!("Deserialization Duration: {:?}", results);
}

Each iteration formats 2 million timestamps (which Rust generates considerably faster than Python's datetime module) from DateTime<Utc> into a datetime string with nanosecond precision. I presume the timezone handling accounts for some slowdown, but I won't test without it: if the software supports timezones, they should be included in the benchmark, so the string carries the maximum amount of information and indicates the correct point in time regardless of where you are on Earth.

The deserialization will take TS_FORMAT and rebuild the datetime strings into Rust DateTime values.

Serialization Duration: [
  TimeDelta { secs: 8, nanos: 548368000 },
  TimeDelta { secs: 8, nanos: 529499000 },
  TimeDelta { secs: 8, nanos: 547975000 },
  TimeDelta { secs: 8, nanos: 532770000 },
  TimeDelta { secs: 8, nanos: 534317000 },
  TimeDelta { secs: 8, nanos: 576425000 },
  TimeDelta { secs: 8, nanos: 519606000 },
  TimeDelta { secs: 8, nanos: 546453000 },
  TimeDelta { secs: 8, nanos: 535094000 },
  TimeDelta { secs: 8, nanos: 622479000 }]
Deserialization Duration: [
  TimeDelta { secs: 11, nanos: 785758000 },
  TimeDelta { secs: 11, nanos: 749219000 },
  TimeDelta { secs: 11, nanos: 946002000 },
  TimeDelta { secs: 11, nanos: 846905000 },
  TimeDelta { secs: 11, nanos: 558516000 },
  TimeDelta { secs: 11, nanos: 828807000 },
  TimeDelta { secs: 11, nanos: 566514000 },
  TimeDelta { secs: 11, nanos: 669675000 },
  TimeDelta { secs: 11, nanos: 751291000 },
  TimeDelta { secs: 11, nanos: 431745000 }]

I was staggered by the results. chrono is running noticeably slower than Python's datetime module here, and more than an order of magnitude slower than Numpy's datetime64.

Here is a second pair of serialization/deserialization benchmarks to see if there is a performance difference when using serde.

Serialization Duration: [
  TimeDelta { secs: 6, nanos: 741766000 },
  TimeDelta { secs: 6, nanos: 739703000 },
  TimeDelta { secs: 6, nanos: 752363000 },
  TimeDelta { secs: 6, nanos: 757359000 },
  TimeDelta { secs: 6, nanos: 783687000 },
  TimeDelta { secs: 6, nanos: 790181000 },
  TimeDelta { secs: 6, nanos: 739028000 },
  TimeDelta { secs: 6, nanos: 740279000 },
  TimeDelta { secs: 6, nanos: 735011000 },
  TimeDelta { secs: 6, nanos: 733734000 }]
Deserialization Duration: [
  TimeDelta { secs: 7, nanos: 839185000 },
  TimeDelta { secs: 7, nanos: 825194000 },
  TimeDelta { secs: 7, nanos: 813160000 },
  TimeDelta { secs: 7, nanos: 817082000 },
  TimeDelta { secs: 7, nanos: 823954000 },
  TimeDelta { secs: 7, nanos: 814741000 },
  TimeDelta { secs: 7, nanos: 817677000 },
  TimeDelta { secs: 7, nanos: 857673000 },
  TimeDelta { secs: 7, nanos: 819424000 },
  TimeDelta { secs: 7, nanos: 818405000 }]

An obvious performance increase using serde. I didn't expect these results, and I hope someone more familiar with the software can point out why serde can serialize and deserialize DateTime<Utc> quicker than using .format.

Time

I haven't used time much, but it seems to be used reliably by a number of dependencies I've started pulling in. Therefore it has landed on my radar, and I'd like to determine if it is more performant than chrono.

The benchmark will be set up similarly to the Chrono benchmarks. We'll test formatting OffsetDateTime into a datetime string and back. Then I'll implement serde and see if there is a performance gain or loss.

#[derive(Debug, Clone, serde::Deserialize, serde::Serialize)]
struct TimeDateTimeString(#[serde(with = "time::serde::rfc3339")] time::OffsetDateTime);

fn only_time() {
  let mut timestamps = Vec::new();
  for _ in 0..2_000_000 {
    timestamps.push(time::OffsetDateTime::now_utc());
  }
  let mut results = Vec::new();
  let mut rendered_timestamps = Vec::new();
  let ts_format = "[year]-[month]-[day]T[hour repr:24 padding:none]:[minute]:[second].[subsecond digits:9][offset_hour sign:mandatory]";
  let ts_format = time::format_description::parse(ts_format).expect("Unable to set formatter");
  for _ in 0..10 {
    rendered_timestamps.clear();
    println!("Iteration {:?}", time::OffsetDateTime::now_utc());
    let start = time::OffsetDateTime::now_utc();
    for timestamp in timestamps.clone() {
      rendered_timestamps.push(timestamp.format(&ts_format).unwrap());
    }
    let stop = time::OffsetDateTime::now_utc();
    let duration = stop - start;
    results.push(duration);
  }
  println!("Serialization Results: {:?}", results);

  let mut results = Vec::new();
  for _ in 0..10 {
    println!("Iteration: {:?}", time::OffsetDateTime::now_utc());
    let start = time::OffsetDateTime::now_utc();
    for timestamp in rendered_timestamps.clone() {
      let _ = time::OffsetDateTime::parse(&timestamp, &ts_format).expect("Failed");
    }
    let stop = time::OffsetDateTime::now_utc();
    let duration = stop - start;
    results.push(duration);
  }

  println!("Deserialization Duration: {:?}", results);
}
fn with_serde_and_time() {
  let mut timestamps = Vec::new();
  for _ in 0..2_000_000 {
    timestamps.push(TimeDateTimeString(time::OffsetDateTime::now_utc()));
  }

  let mut results = Vec::new();
  let mut rendered_timestamps = Vec::new();
  for _ in 0..10 {
    rendered_timestamps.clear();
    println!("Iteration {:?}", time::OffsetDateTime::now_utc());
    let start = time::OffsetDateTime::now_utc();
    for timestamp in timestamps.clone() {
      let result = serde_json::to_string(&timestamp).expect("Failed");
      rendered_timestamps.push(result);
    }
    let stop = time::OffsetDateTime::now_utc();
    let duration = stop - start;
    results.push(duration);
  }
  
  println!("Serialization Duration: {:?}", results);

  let mut results = Vec::new();
  for _ in 0..10 {
    println!("Iteration {:?}", time::OffsetDateTime::now_utc());
    let start = time::OffsetDateTime::now_utc();
    for timestamp in rendered_timestamps.clone() {
      let _: TimeDateTimeString = serde_json::from_str(timestamp.as_str()).expect("Failed");
    }
    let stop = time::OffsetDateTime::now_utc();
    let duration = stop - start;
    results.push(duration);
  }

  println!("Deserialization Duration: {:?}", results);
}

The time::OffsetDateTime API took me some time to get used to, but it seems more idiomatic than chrono, datetime, and datetime64. I think that's great, but at the same time it makes it hard to find the correct syntax for, say, the ts_format variable. I have specified that I'd like nanosecond precision, but I have no idea how to set that level of precision using OffsetDateTime::now_utc().

Serialization Results: [
  Duration { seconds: 4, nanoseconds: 939048000 },
  Duration { seconds: 4, nanoseconds: 945770000 },
  Duration { seconds: 4, nanoseconds: 956920000 },
  Duration { seconds: 4, nanoseconds: 945419000 },
  Duration { seconds: 4, nanoseconds: 949541000 },
  Duration { seconds: 4, nanoseconds: 947545000 },
  Duration { seconds: 4, nanoseconds: 951264000 },
  Duration { seconds: 4, nanoseconds: 975607000 },
  Duration { seconds: 4, nanoseconds: 949841000 },
  Duration { seconds: 4, nanoseconds: 941445000 }]
Deserialization Duration: [
  Duration { seconds: 7, nanoseconds: 333048000 },
  Duration { seconds: 7, nanoseconds: 339670000 },
  Duration { seconds: 7, nanoseconds: 331340000 },
  Duration { seconds: 7, nanoseconds: 323940000 },
  Duration { seconds: 7, nanoseconds: 321096000 },
  Duration { seconds: 7, nanoseconds: 327553000 },
  Duration { seconds: 7, nanoseconds: 349085000 },
  Duration { seconds: 7, nanoseconds: 314581000 },
  Duration { seconds: 7, nanoseconds: 338267000 },
  Duration { seconds: 7, nanoseconds: 324325000 }]

Way better results than chrono's datetime string formatting, and fairly close to Python's datetime module's serialization speed. An obvious performance improvement over chrono, but still only on par with datetime and behind datetime64.

Here is a second pair of serialization/deserialization benchmarks to see if there is a performance difference when using serde.

Serialization Duration: [
  Duration { seconds: 5, nanoseconds: 285355000 },
  Duration { seconds: 5, nanoseconds: 363159000 },
  Duration { seconds: 5, nanoseconds: 300327000 },
  Duration { seconds: 5, nanoseconds: 277747000 },
  Duration { seconds: 5, nanoseconds: 275644000 },
  Duration { seconds: 5, nanoseconds: 279714000 },
  Duration { seconds: 5, nanoseconds: 273839000 },
  Duration { seconds: 5, nanoseconds: 317712000 },
  Duration { seconds: 5, nanoseconds: 297983000 },
  Duration { seconds: 5, nanoseconds: 285739000 }]
Deserialization Duration: [
  Duration { seconds: 7, nanoseconds: 849486000 },
  Duration { seconds: 7, nanoseconds: 826524000 },
  Duration { seconds: 7, nanoseconds: 824564000 },
  Duration { seconds: 7, nanoseconds: 850677000 },
  Duration { seconds: 7, nanoseconds: 822075000 },
  Duration { seconds: 7, nanoseconds: 831446000 },
  Duration { seconds: 7, nanoseconds: 827239000 },
  Duration { seconds: 7, nanoseconds: 826451000 },
  Duration { seconds: 7, nanoseconds: 832332000 },
  Duration { seconds: 7, nanoseconds: 906543000 }]

Only a small performance hit on serialization and a very minor one on deserialization. Very impressive.

Comparative Results

Module / Software  | Average Serialization Duration | Average Deserialization Duration
Python's datetime  | ~4 s                           | ~10 s
Numpy's datetime64 | ~0.37 s (371,548 µs)           | ~0.77 s (774,869 µs)
Chrono             | ~8.5 s                         | ~11.7 s
Chrono & Serde     | ~6.7 s                         | ~7.8 s
Time               | ~4.9 s                         | ~7.3 s
Time & Serde       | ~5.3 s                         | ~7.8 s

The comparative results have me wondering if there is a correlation between the datetime64 and time results. Obviously, if you're going to load large amounts of data, Numpy's datetime64 datetime strings are the way to go for now. I wonder how well datetime64 would perform in a Spark runtime.