Submit dataset requests and feature ideas here. For bug reports, use our chat support or issues tracker instead.
e.g., shares outstanding, short interest, market capitalization, P/E ratio, etc.
2
This ticket tracks releasing historical data as early as permissible. Currently, we embargo historical data at a strict 24-hour cutoff to ensure that it can be safely distributed as historical data for every venue, thus sidestepping real-time/delayed licensing requirements for our users. However, many venues actually define their "historical" boundary as the end of the same date in the venue's local timezone OR the session end. So in theory, if a session ends at 8 PM ET, data timestamped 7:59 PM ET on the same day could be distributed at 8 PM ET. Currently, to get data from within the trading session, you must use the live API (via the Raw API or a live client built on it). However, the Raw API can be unwieldy for use cases that only need a small amount of data from the current trading session. For example, if a user only needs a few instrument definitions or settlement prices, or wants to update a ticker tape from subsampled OHLCV bars, a request-response model like our HTTP API is usually preferable; setting up and tearing down a stateful TCP subscription for the live API is too heavy for this. Once released, users will be able to access intraday historical data via the HTTP API so long as they have a live data entitlement. See also: https://roadmap.databento.com/b/n0o5prm6/feature-ideas/provide-snapshots-for-historical-and-live-data
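To make the boundary change concrete, here is a minimal sketch of the two embargo rules described above. The function name and signature are purely illustrative, not part of any Databento API: under the current rule a record is "historical" only once it is 24 hours old; under the proposed rule it becomes historical as soon as the venue's session has ended.

```python
from datetime import datetime, timedelta, timezone

def is_historical(record_ts: datetime, session_end: datetime, now: datetime,
                  use_session_boundary: bool) -> bool:
    """Illustrative embargo check; names are hypothetical."""
    if use_session_boundary:
        # Proposed rule: anything from a session that has ended is historical.
        return now >= session_end and record_ts <= session_end
    # Current rule: strict 24-hour embargo.
    return now - record_ts >= timedelta(hours=24)

# Example: session ends 8 PM ET (00:00 UTC next day, ignoring DST for brevity).
session_end = datetime(2023, 6, 2, 0, 0, tzinfo=timezone.utc)
record_ts = session_end - timedelta(minutes=1)  # a 7:59 PM ET record
now = session_end                               # queried right at 8 PM ET

print(is_historical(record_ts, session_end, now, use_session_boundary=True))   # True
print(is_historical(record_ts, session_end, now, use_session_boundary=False))  # False
```

The same 7:59 PM record is available a minute later under the session-end rule, but not for another 24 hours under the current rule.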
1
Full depth of book feed for Cboe Futures Exchange (CFE). CFE contains volatility futures and corporate bond index futures, such as VIX futures (VX, VXM).
0
Dividends, stock splits, mergers, ticker changes, adjusted EOD historical prices.
4
Currently, indices are only indirectly supported through tradable index instruments (CME futures, ETFs, etc.); we don't provide the non-tradable index values themselves. These may be sourced from a feed like the Cboe Global Indices Feed or the NYSE Global Index Feed.
3
Currently, equities are supported via the individual proprietary feeds of each venue. While NASDAQ is sufficient for getting the NBBO most of the time, some users prefer something more in line with the actual NBBO from the SIPs. This feature request tracks three possible modes of consolidation for both historical and live data:
1. Databento server-side consolidation of multiple proprietary feeds
2. Consolidated data from a proprietary feed like Nasdaq Basic in lieu of the SIP
3. Consolidated data from the CTA/UTP SIPs
We plan on implementing 1-2 of these three options.
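For intuition on option 1, server-side consolidation reduces to taking the best bid and offer across venues. The sketch below illustrates that technique only; it is not Databento's implementation, and the venue codes and field layout are made up for the example.

```python
def consolidate(quotes: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """quotes maps venue -> (best_bid, best_ask); returns an NBBO-like (nbb, nbo)."""
    nbb = max(bid for bid, _ in quotes.values())  # highest bid across venues
    nbo = min(ask for _, ask in quotes.values())  # lowest ask across venues
    return nbb, nbo

quotes = {
    "XNAS": (100.01, 100.03),
    "ARCX": (100.02, 100.05),
    "BATS": (100.00, 100.04),
}
print(consolidate(quotes))  # (100.02, 100.03)
```

A real consolidation also has to handle per-venue timestamps, sizes, and crossed/locked markets, which is part of what makes option 1 harder than options 2 and 3.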
7
Support Parquet as an encoding, alongside DBN, CSV, and JSON.
11
For options, including in the CME and OPRA datasets, the existing mbp-1 schema can have significant record volume despite the instruments being extremely illiquid. In practice, there are several orders of magnitude more mbp-1 records than trades, which is unwieldy to work with. Many services offer BBO summaries at a fixed interval, such as 1 or 10 minutes. This would be similar to the ohlcv-1m schema, but would include the BBO information present in mbp-1 and tbbo instead of only containing trades. This would be a new schema (or set of schemas). What kind of information would you like to see in the MBP-1 summary message? Edit: Adding a few keywords here to facilitate search: subsampled BBO, subsampled MBP-1, MBP-1 snapshots.
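Until such a schema exists, the subsampling can be done client-side. This sketch takes the last observed BBO in each 1-minute bucket with pandas; the column names are illustrative, not the exact DBN field set.

```python
import pandas as pd

# Toy mbp-1-like frame: one row per top-of-book update, indexed by event time.
mbp1 = pd.DataFrame(
    {
        "bid_px": [100.00, 100.01, 100.02, 100.03],
        "ask_px": [100.05, 100.04, 100.03, 100.05],
    },
    index=pd.to_datetime(
        ["2023-06-01 14:30:05", "2023-06-01 14:30:40",
         "2023-06-01 14:31:10", "2023-06-01 14:31:55"]
    ),
)

# Keep the last observed BBO in each 1-minute bucket, like a 1-minute BBO summary.
bbo_1m = mbp1.resample("1min").last().dropna()
print(bbo_1m)
```

This mirrors the proposed schema's behavior at a fixed interval, but requires downloading the full mbp-1 volume first, which is exactly the cost the new schema would avoid.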
3
This feature tracks a potential backfill of our CME Globex dataset back to Jan 2009. Note that there will be some limitations:
- Prior to May 2017, CME Globex used a legacy FIX/FAST format with at most 10 levels of depth, millisecond-resolution timestamps, and no high-granularity match/send timestamps.
- We will source the data from CME directly, but they do not have pcaps going back to 2009. This means we won't have a separate ts_recv for the history.
- We will need to consider how to handle the dataset naming, as "CME Globex MDP 3.0" will not be appropriate.
4
This client library makes all our historical and live features easier to integrate in C# on Windows, Linux, and macOS. C# (C Sharp) is already supported through our HTTP API and Raw TCP protocol, which are both language-agnostic.
4
There are two features tracked here with slightly different purposes:
1. Historical snapshots and last-as-of. We may introduce a historical API endpoint like timeseries.get_last or timeseries.get_snapshot to return the last price as of a specified time, defaulting to the latest available time. Ideally, when a user is licensed for live data, this endpoint should provide intraday historical last prices over HTTP, which may be more convenient than a persistent streaming connection.
2. Live snapshots. The main benefit of this would be to allow a live client to recover from gaps or lost state. Currently, this requires replaying from the start of the trading session, which is expensive. Moreover, if our replay system fails, snapshots could provide a fallback.
This is distinct from "MBO book snapshots in Live", which would provide periodic snapshots in the stateful MBO schema to make it easier to consume full book data.
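The "last-as-of" semantics can be sketched locally: given prices sorted by time, return the last one at or before the query time. The endpoint names above are the proposal's; this standalone function is purely illustrative of the lookup semantics, not of any real API.

```python
from bisect import bisect_right
from datetime import datetime

def get_last(ticks: list[tuple[datetime, float]], as_of: datetime) -> float:
    """Return the last price at or before as_of; ticks must be sorted by time."""
    times = [ts for ts, _ in ticks]
    i = bisect_right(times, as_of)  # number of ticks at or before as_of
    if i == 0:
        raise ValueError("no data at or before the requested time")
    return ticks[i - 1][1]

ticks = [
    (datetime(2023, 6, 1, 14, 30), 100.0),
    (datetime(2023, 6, 1, 14, 31), 100.5),
    (datetime(2023, 6, 1, 14, 33), 101.0),
]
print(get_last(ticks, datetime(2023, 6, 1, 14, 32)))  # 100.5
```

Defaulting as_of to "now" gives the latest-available behavior described above.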
7
Data for Eurex, including all schemas (MBO, MBP, ohlcv, etc.).
7
Ability to get raw data payloads and historical PCAPs, as opposed to the normalized data that is currently available. For example, on CME, the normalized data has shortcomings:
- Normalized schemas do not provide MDOrderPriority, which can be useful for LMM markets or make it easier to handle order quantity down-modifies.
- Normalized schemas do not show the full message tree of trade summary messages, making it harder and slower to associate passive-side changes after an aggressing trade is reported. This arises because CME reports trades asynchronously from order update publication; order updates are published after the trade summary. Currently, a user has to use ts_event to associate a trade with the depleted orders, which is especially inconvenient when multiple passive orders are on the other side of a large aggressing order.
On other venues, there are other message fields that our normalized format fails to capture. The main challenge with providing PCAPs and raw data is that it is bandwidth-intensive. Furthermore, the message sizes vary, which throws off our pricing, search, filtering, and merging routines. This will introduce some asymmetry to our API design. A solution will likely take the form of separate endpoint(s) on the historical side, released only after we've provided customer VMs.
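The ts_event workaround mentioned above amounts to bucketing records by their event timestamp so that a trade summary and the passive-side updates it triggered end up together. The record fields below are illustrative, not the exact DBN layout.

```python
from collections import defaultdict

# Toy record stream: a trade summary followed by the passive fills it caused,
# all sharing one ts_event, plus an unrelated later quote update.
records = [
    {"ts_event": 1000, "kind": "trade", "size": 30},
    {"ts_event": 1000, "kind": "fill", "order_id": 7, "size": 10},
    {"ts_event": 1000, "kind": "fill", "order_id": 9, "size": 20},
    {"ts_event": 1005, "kind": "quote", "order_id": 4},
]

by_event = defaultdict(list)
for rec in records:
    by_event[rec["ts_event"]].append(rec)

# The passive orders depleted by the ts_event=1000 aggressing trade:
print([r["order_id"] for r in by_event[1000] if r["kind"] == "fill"])  # [7, 9]
```

This works, but a raw feed's explicit message tree would make the association direct instead of inferred, which is the gap this feature addresses.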
3
Currently, it's not possible to detect from the existing schemas when an instrument goes in and out of auction, or when trading is halted. A trading phase/trading status schema would enable users to detect these events.
4
Make our historical and live APIs easier to integrate from Java.
1