Faster read of csv2 format #321

Merged
merged 4 commits into from
Aug 12, 2024
5 changes: 5 additions & 0 deletions benchmarks/dataframe_read_large_file.cc
@@ -85,6 +85,11 @@ int main(int, char *[]) {
<< double(duration_cast<microseconds>(end - start).count()) / 1000000.0
<< " seconds\n";

/*
df.write<long, unsigned int, int, unsigned long>
("Large_File.dat", io_format::binary);
*/

return (0);
}
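
The benchmark reports elapsed time as a microsecond count divided by 1000000.0. As a minimal sketch of that measurement pattern, and not the benchmark's own code, the hypothetical helper below (`time_in_seconds` is an illustrative name) wraps any callable. Using `std::chrono::steady_clock` is an assumption; this hunk does not show which clock the benchmark uses.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper, not part of the DataFrame benchmark: runs `work`
// once and returns the elapsed wall-clock time in seconds.
// Assumption: steady_clock is used because it is monotonic; the clock
// used by the actual benchmark is not visible in the diff.
template<typename F>
double time_in_seconds(F &&work)  {

    using namespace std::chrono;

    const auto  start = steady_clock::now();

    std::forward<F>(work)();

    const auto  end = steady_clock::now();

    // Same conversion as in the benchmark: microseconds -> seconds
    return (double(duration_cast<microseconds>(end - start).count()) /
            1000000.0);
}
```

A call such as `time_in_seconds([&] { /* read the file here */ })` would time a single read, which keeps the measurement code separate from the work being measured.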

32 changes: 17 additions & 15 deletions docs/HTML/DateTime.html
@@ -77,34 +77,34 @@
<UL>
<LI><font color="blue" size="+1"><B>Page Index</B></font></LI>
<UL>
<LI><a href="https://github.com/hosseinmoein/DataFrame?tab=readme-ov-file"><font size="+2">&#8592;</font> Back to Github</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#0"><font size="+2">&#9730;</font> Summary</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#1"><font size="+2">&#128193;</font> Code structure</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#2"><font size="+2">&#x1F6E0;</font> Build Instructions</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#3"><font size="+2">&#129513;</font> Example</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#4"><font size="+2">&#129419;</font> Types</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#5"><font size="+2">&#128477;</font> Member Functions</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#6"><font size="+2">&#127760;</font> Global DateTime Operators</a></LI>
<LI><a href="https://github.com/hosseinmoein/DataFrame?tab=readme-ov-file"><font size="+3">&#8592;</font> Back to Github</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#0"><font size="+3">&#9730;</font> Summary</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#1"><font size="+3">&#128450;</font> Code structure</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#2"><font size="+3">&#x1F6E0;</font> Build Instructions</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#3"><font size="+3">&#129513;</font> Example</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#4"><font size="+3">&#129419;</font> Types</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#5"><font size="+3">&#128477;</font> Member Functions</a></LI>
<LI><a href="https://htmlpreview.github.io/?https://github.com/hosseinmoein/DataFrame/blob/master/docs/HTML/DateTime.html#6"><font size="+3">&#127760;</font> Global DateTime Operators</a></LI>
</UL>
</UL>

<H2 ID="0"><font color="blue">Summary</font></H2>
<H2 ID="0"><font color="blue">Summary <font size="+4">&#9730;</font></font></H2>
<font size="+1">Since DataFrame is a statistical library, it often deals with time-series data, so it needs to keep track of time.<BR>
The most efficient way to index a DataFrame by time is to use an index type of time_t for second precision, or double or long long integer for finer precision. The DateTime class provides more elaborate handling of time and is also a handy general-purpose date/time object. It manipulates date/time with nanosecond precision and supports multiple time zones. Its simple, intuitive interface lets you break a date/time into its components, reassemble a date/time from its components, advance or pull back a date/time at different granularities, and more.</font><BR>

<BR><HR COLOR="Gray" SIZE="5">

<H2 ID="1"><font color="blue">Code structure</font></H2>
<H2 ID="1"><font color="blue">Code structure <font size="+4">&#128450;</font></font></H2>

<font size="+1">Both the header (DateTime.h) and source (DateTime.cc) files are part of the DataFrame project. They are in the usual include/Utils and src/Utils directories.</font><BR>

<H2 ID="2"><font color="blue">Build Instructions</font></H2>
<H2 ID="2"><font color="blue">Build Instructions <font size="+4">&#x1F6E0;</font></font></H2>

<font size="+1">Follow the DataFrame build instructions.</font><BR><BR>

<HR COLOR="Gray" SIZE="5">

<H2 ID="3"><font color="blue">Example</font></H2>
<H2 ID="3"><font color="blue">Example <font size="+4">&#129513;</font></font></H2>

<font size="+1">This library can have up to nanosecond precision, depending on which system calls are available. Here is some example code:</font><BR>
<font size="+1">
@@ -136,7 +136,7 @@ <H2 ID="3"><font color="blue">Example</font></H2>

<HR COLOR="Gray" SIZE="5">

<H2 ID="4"><font color="blue">Types</font></H2>
<H2 ID="4"><font color="blue">Types <font size="+4">&#129419;</font></font></H2>
<font size="+1">These constants are used for formatting date/time into strings:</font><BR>
<font size="+1">
<pre class="code_syntax" style="color:#000000;background:#ffffff00;"><span class="line_wrapper"> <span style="color:#800000; font-weight:bold; ">enum</span> <span style="color:#800000; font-weight:bold; ">class</span> DT_FORMAT <span style="color:#800080; ">:</span> <span style="color:#800000; font-weight:bold; ">unsigned</span> <span style="color:#800000; font-weight:bold; ">short</span> <span style="color:#800000; font-weight:bold; ">int</span> <span style="color:#800080; ">{</span></span>
@@ -305,7 +305,7 @@ <H2 ID="4"><font color="blue">Types</font></H2>

<BR><HR COLOR="Gray" SIZE="5">

<H2 ID="5"><font color="blue">Member Functions</font></H2>
<H2 ID="5"><font color="blue">Member Functions <font size="+4">&#128477;</font></font></H2>
<font size="+1">
<pre class="code_syntax" style="color:#000000;background:#ffffff00;"><span class="line_wrapper"> <span style="color:#696969; ">// A constructor that creates a DateTime initialized to now.</span></span>
<span class="line_wrapper"> <span style="color:#696969; ">// tz: Desired time zone from DT_TIME_ZONE above.</span></span>
@@ -491,7 +491,7 @@ <H2 ID="5"><font color="blue">Member Functions</font></H2>
</font>

<BR><HR COLOR="Gray" SIZE="5">
<H2 ID="6"><font color="blue">Global DateTime Operators</font></H2>
<H2 ID="6"><font color="blue">Global DateTime Operators <font size="+4">&#127760;</font></font></H2>
<font size="+1">
<pre class="code_syntax" style="color:#000000;background:#ffffff00;"><span class="line_wrapper"><span style="color:#696969; ">// DateTime output operator to a stream</span></span>
<span class="line_wrapper"><span style="color:#696969; ">//</span></span>
@@ -544,6 +544,8 @@ <H2 ID="6"><font color="blue">Global DateTime Operators</font></H2>
<span class="line_wrapper"><span style="color:#800080; ">}</span><span style="color:#800080; ">;</span></span></pre>
</font>

<BR><a href="https://github.com/hosseinmoein/DataFrame?tab=readme-ov-file">&#8592; Back to Github</a><BR>

</body></html>

<!--
3 changes: 2 additions & 1 deletion include/DataFrame/Internals/DataFrame_private_decl.h
@@ -29,6 +29,7 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

#pragma once

#include <cstdio>
#include <ranges>

// ----------------------------------------------------------------------------
@@ -66,7 +67,7 @@ void read_binary_(std::istream &file,
size_type starting_row,
size_type num_rows);
void read_csv_(std::istream &file, bool columns_only);
void read_csv2_(std::istream &file,
void read_csv2_(std::FILE *stream,
bool columns_only,
size_type starting_row,
size_type num_rows);
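
The declaration change above replaces the `std::istream &` parameter of `read_csv2_` with a `std::FILE *`. The implementation is not part of this hunk, so the following is only a minimal sketch of line-oriented reading through the C stdio interface, which is presumably the kind of access the new signature enables. `read_lines` and its buffer size are hypothetical and are not DataFrame code.

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of DataFrame: reads every line of an
// already opened C stream with std::fgets and returns the lines without
// their trailing newline characters.
inline std::vector<std::string> read_lines(std::FILE *stream)  {

    std::vector<std::string>    lines;
    std::string                 current;
    char                        buffer[4096];  // Arbitrary chunk size

    while (std::fgets(buffer, sizeof(buffer), stream) != nullptr)  {
        current += buffer;
        // A line longer than the buffer arrives in several chunks; only a
        // trailing '\n' marks the end of the line.
        if (! current.empty() && current.back() == '\n')  {
            current.pop_back();
            lines.push_back(std::move(current));
            current.clear();
        }
    }
    // The last line of a file may not end with a newline
    if (! current.empty())
        lines.push_back(std::move(current));

    return (lines);
}
```

A caller would open the file with `std::fopen`, pass the resulting pointer in, and close it with `std::fclose` afterwards; ownership of the stream stays with the caller in this sketch.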