-[1.0.1] remove whitespace and commented out old code on all stored programs
@@ -98,7 +98,7 @@
 -[3.0.0] rename TABLES `log_clientname` to `log_client`, `log_servername` to `log_server`
 -[3.0.0] rename COLUMNS `clientnameid` to `clientid`, `servernameid` to `serverid` throughout application tables and processes.
 -[3.0.0] modify `process_access_parse` and `process_error_parse` WHERE CLAUSES for server_name UPDATE commands.
--[3.0.0] add 16 stored functions for log attribute tables to return names for Slice and dice is a data analysis in drill-down Web interface.
+-[3.0.0] add 16 stored functions for primary attribute tables to return names for Slice and dice is a data analysis in drill-down Web interface.
 -[3.0.0] modify and reworded all console log messages in `logs2mysql.py` to standardize messages for each process. Added COLORS to coordinate message types for better readability.
 -[3.0.0] modify all database INDEX NAMES for standardization and consolidation.
 -[3.0.0] tested simultaneously uploading logs from 10 VPS with multiple VirtualHosts on each Server processing thousands of files in different formats and millions of log records.
-To contribute any Issues or Errors found using application please create a `New issue` under repository `Issues` tab.
+To contribute Issues or Errors found using application please create a `New issue` under repository `Issues` tab.

 To contribute Ideas or Comments please create a `New discussion` under repository `Discussions` tab.

-To contribute Apache Access or Error Log Formats commonly used that application should process please start `New discussion` about that.
+To contribute Apache Access or Error Log Formats commonly used that application should process please start `New discussion`.

 Any organizations, people or person with multiple Apache servers that find application a godsend in log collection monetary contributions are appreciated. Repository has my ***Buy Me a Coffee*** & ***Venmo*** links.

-I volunteer for a nonprofit organization that wanted to import their Apache website logs into MySQL tables to query data. The Executive Director loves MySQL so I decided to research existing solutions that used MySQL. I thought it would be two or three days of my time.
-
-First I installed the Apache log_sql_mysql modules which did create a single MySQL mostly empty table of the access log with no control or customization and many other issues. Next I experimented with several simple log file parsers but none normalized the parsed log data into a MySQL database. Finally I reviewed other available Apache logging solutions that didn't use MySQL including GoAccess, Logstash, Apache Viewer, DataDog and others as well as CrowdStrike and Solarwinds Loggly.
-
-Mid-September 2024 after all my research I decided to write a simple solution which snowballed into a complete application. All October I worked long hours around the clock. November I spent incorporating the application into VPS websites and applications I oversee while making improvements along the way. Version 2.0.0 fixed the major issues encountered and is the application baseline. December I spent refining the major changes made in Version 2.0.0. Version 2.1.5 was last code change to fix client identification issue when OS version changes by adding `import_device` TABLE.
-
-First 2 weeks of January 2025 I spent processing millions of records from 10 VPS simultaneously to single MySQL Server. Version 3.0.0 is last major change with IP Address geoLocation and a final pass through to fine tune processes and rename some tables and columns. This version of application is production ready.
-
-The final version is less Python and more SQL and much faster processing millions of records. At this point, I have over 1050 hours of research, design, iteration & development into application. It is much more time then I intended to invest into this project but it did produce my first open-source software.
-
-That's how volunteering, lack of a viable MySQL solution and a flexible schedule came together just right to allow me to dive deep into this project.
-
-### “Timing, degree and conviction are the three wise men in this life.” — Robert I. Fitzhenry
-
 Monetary contributions made will be reflected in development of [Web Interface](https://github.com/WillTheFarmer/mysql-to-apache-echarts) for this MySQL `apache_logs` schema.
.github/README.md
16 additions & 9 deletions
@@ -6,18 +6,24 @@ and normalizing data into database designed for reports & data analysis.
 Imports Access Logs in LogFormats - ***common***, ***combined*** and ***vhost_combined*** & additional ***csv2mysql***
 LogFormat defined :point_down:

-Imports Error Logs in ***default*** ErrorLogFormat & ***additional*** ErrorLogFormat defined below performing data harmonization on Apache Codes & Messages, System Codes & Messages, and Log Messages to create a unified, standardized dataset. Error Log view images :point_down:
+Imports Error Logs in ***default*** ErrorLogFormat & ***additional*** ErrorLogFormat defined below performing data harmonization
+on Apache Codes & Messages, System Codes & Messages, and Log Messages to create a unified, standardized dataset. Error Log view images :point_down:

-All processing stages are encapsulated within one "Import Load" that captures process metrics, notifications and errors into MySQL import tables. Every log data record is traceable back to the computer, folder, file, load process, parse process and import process it came from.
+All processing stages are encapsulated within one "Import Load" that captures process metrics, notifications and errors into MySQL import tables.
+Every log data record is traceable back to the computer, folder, file, load process, parse process and import process it came from.

-Multiple Access and Error logs and formats can be loaded, parsed and imported along with User Agent parsing and IP Address geoLocation retrieval in a single execution. A single execution can also be configured to only load logs to Server.
-### Console Process Messages - 4 LogFormats, 2 ErrorLogFormats & 6 MySQL Stored Procedures
+Multiple Access and Error logs and formats can be loaded, parsed and imported along with User Agent parsing and IP Address geolocation retrieval in a single execution.
+A single execution can also be configured to only load logs to Server.
+### Process Messages in Console - 4 LogFormats, 2 ErrorLogFormats & 6 MySQL Stored Procedures
-New version has [MaxMind GeoIP2](https://github.com/maxmind/GeoIP2-python) Python API integration with 5 additional MySQL tables for IP geoLocation data. Two DB-IP Lite databases are required - `IP to City` and `IP to ASN`. Free DB-IP Lite databases can be found at [DB-IP](https://db-ip.com/db/lite.php)
-
-A visualization tool for the MySQL Schema ***apache_logs*** is [MySQL2ApacheECharts](https://github.com/willthefarmer/mysql-to-apache-echarts) and currently under development. The Web interface consists of Express.js web application frameworks with Drill Down Capability & [Apache ECharts](https://github.com/apache/echarts) frameworks for Data Visualization.
+ApacheLogs2MySQL has [MaxMind GeoIP2](https://github.com/maxmind/GeoIP2-python) Python API integration with 6 MySQL tables for IP geolocation data normalization.
+Two DB-IP Lite databases are required - `IP to City` and `IP to ASN`. Free DB-IP Lite databases can be found at [DB-IP](https://db-ip.com/db/lite.php)

 Database Schema ***apache_logs*** designed to accommodate unlimited servers & domains. Step-by-step guide for easy installation :point_down:
+
+A visualization tool for the MySQL Schema ***apache_logs*** is [MySQL2ApacheECharts](https://github.com/willthefarmer/mysql-to-apache-echarts) and currently under development.
+The Web interface consists of [Express](https://github.com/expressjs/express) web application frameworks with Drill Down Capability
+& [Apache ECharts](https://github.com/apache/echarts) frameworks for Data Visualization.
 ## Entity Relationship Diagram of apache_logs schema tables
|%{VARNAME}C|ADDED - The contents of cookie VARNAME in request sent to server. Only version 0 cookies are fully supported. Format String is optional.|
 |%L|ADDED - The request log ID from the error log (or '-' if nothing has been logged to the error log for this request). Look for the matching error log line to see what request| caused what error.
 ## Two supported Error Log Formats
-Application processes Error Logs with ***default format*** for threaded MPMs (Multi-Processing Modules). If running Apache 2.4 on any platform and ErrorLogFormat is not defined in config files this is the Error Log format.
+Application processes Error Logs with ***default format*** for threaded MPMs (Multi-Processing Modules). If running Apache 2.4 on any platform
+and ErrorLogFormat is not defined in config files this is the Error Log format.
 Information from: https://httpd.apache.org/docs/2.4/mod/core.html#errorlogformat
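
As an illustration of the default error log format mentioned in the README text above, here is a minimal Python sketch that splits one default-format line into fields. It is not the parser used by `logs2mysql.py`; the regular expression, the group names and the helper `parse_error_line` are assumptions made for this example, and the sample line is the one shown in the Apache 2.4 `ErrorLogFormat` documentation. Lines that carry a system error string before the `[client ...]` part will still match, but with that text left in `message`.

```python
import re

# Regex for the common shape of an Apache 2.4 default error log line.
# Group names are illustrative and not taken from logs2mysql.py.
ERROR_LINE = re.compile(
    r'^\[(?P<timestamp>[^\]]+)\] '
    r'\[(?P<module>[^:\]]*):(?P<level>[^\]]+)\] '
    r'\[pid (?P<pid>\d+)(?::tid (?P<tid>\d+))?\] '
    r'(?:\[client (?P<client>[^\]]+)\] )?'
    r'(?P<message>.*)$'
)


def parse_error_line(line):
    """Return a dict of fields, or None when the line does not match."""
    match = ERROR_LINE.match(line)
    return match.groupdict() if match else None


# Sample line from the Apache 2.4 ErrorLogFormat documentation.
sample = ('[Fri Sep 09 10:42:29.902022 2011] [core:error] '
          '[pid 35708:tid 4328636416] [client 72.15.99.187] '
          'File does not exist: /usr/local/apache2/htdocs/favicon.ico')
fields = parse_error_line(sample)
```

Returning `None` for non-matching lines lets a caller count or log unparsed records instead of failing the whole load.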
@@ -289,7 +296,7 @@ Normalization ensures that data is organized in a way that makes sense for the d
 MySQL `apache_logs` schema currently has 55 Tables, 908 Columns, 188 Indexes, 72 Views, 8 Stored Procedures and 90 Functions to process Apache Access log in 4 formats
 & Apache Error log in 2 formats. Database normalization at work!
 ## MySQL Access Log View by Browser - 1 of 66 schema views
-Current schema views are Access and Error Attribute Primary tables created in normalization process with simple aggregate values.
+Current schema views are Access and Error primary attribute tables created in normalization process with simple aggregate values.
 These are primitive data presentations of the log data warehouse. ApacheLogs2MySQL is the 'EL' of the 'ELK' Stack. The Web interface
 [MySQL2ApacheECharts](https://github.com/willthefarmer/mysql-to-apache-echarts) in development is the 'K' of the 'ELK' Stack.
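
The README text above describes views over primary attribute tables with simple aggregate values. The following sketch shows that general idea only. It uses an in-memory SQLite database rather than MySQL so that it runs without a server, and every table, column and view name (`browser`, `access_log`, `access_browser_list`, `reqbytes`) is hypothetical rather than taken from the `apache_logs` schema.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    -- Attribute table: each distinct browser name is stored once.
    CREATE TABLE browser (
        id   INTEGER PRIMARY KEY,
        name TEXT UNIQUE NOT NULL
    );
    -- Log table: stores only the id of the attribute value.
    CREATE TABLE access_log (
        id        INTEGER PRIMARY KEY,
        browserid INTEGER NOT NULL REFERENCES browser(id),
        reqbytes  INTEGER NOT NULL
    );
    -- View: attribute names with simple aggregates.
    CREATE VIEW access_browser_list AS
        SELECT b.name AS browser,
               COUNT(l.id) AS log_count,
               SUM(l.reqbytes) AS bytes_sent
        FROM browser b
        JOIN access_log l ON l.browserid = b.id
        GROUP BY b.id, b.name;
""")


def browser_id(name):
    """Insert the browser name if it is new and return its id."""
    conn.execute("INSERT OR IGNORE INTO browser (name) VALUES (?)", (name,))
    row = conn.execute("SELECT id FROM browser WHERE name = ?", (name,)).fetchone()
    return row[0]


for name, nbytes in [("Firefox", 512), ("Chrome", 2048), ("Firefox", 1024)]:
    conn.execute(
        "INSERT INTO access_log (browserid, reqbytes) VALUES (?, ?)",
        (browser_id(name), nbytes),
    )

rows = conn.execute(
    "SELECT browser, log_count, bytes_sent FROM access_browser_list ORDER BY browser"
).fetchall()
```

Storing the name once and referencing it by id keeps the log table narrow, and the view restores the readable name at query time.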
logs2mysql.py
4 additions & 4 deletions
@@ -38,7 +38,7 @@
 from time import time
 from time import ctime
 from datetime import datetime
-load_dotenv() # Loads variables from .env into the environment
+load_dotenv() # Loads variables from .env into the environment
 mysql_host = getenv('MYSQL_HOST')
 mysql_port = int(getenv('MYSQL_PORT'))
 mysql_user = getenv('MYSQL_USER')
@@ -76,7 +76,7 @@
 geoip2_city = getenv('GEOIP2_CITY')
 geoip2_asn = getenv('GEOIP2_ASN')
 geoip2_process = int(getenv('GEOIP2_PROCESS'))
-# makes process start, complete, info and error messages noticeable in console - all error messages start with 'ERROR - ' for keyword log search
+# Readability of process start, complete, info and error messages in console - all error messages start with 'ERROR - ' for keyword log search
 class bcolors:
     GREEN = '\33[32m'
     GREENER = '\033[92m'
@@ -98,7 +98,7 @@ class bcolors:
     'database': mysql_schema,
     'local_infile': True
 }
-# information to identify & register import upload client
+# Information to identify & register import load clients
 def get_device_id():
     sys_os = system()
     if sys_os == "Windows":
@@ -591,7 +591,7 @@ def processLogs():
 # SECONDARY PROCESSES BELOW: Client Module UPLOAD is done with load, parse and import processes of access and error logs. The below processes enhance User Agent and Client IP log data.
 # Initially UserAgent and GeoIP2 processes were each in separate files. After much design consideration and application experience and Code Redundancy being problematic
 # the decision was made to encapsulate all processes within the same "Import Load" which captures and logs all execution metrics, notifications and errors
-# into MySQL tables for each execution. Every log datarecord can be tracked back to the file, folder, computer, load process, parse process and import process it came from.
+# into MySQL tables for each execution. Every log data record can be tracked back to the file, folder, computer, load process, parse process and import process it came from.
 # Processes may require individual execution even when NONE of above processes are executed. If this Module is run automatically on a client server to upload Apache Logs to centralized
 # MySQL Server the processes below will never be executed. In some cases, only the processes below are needed for execution on MySQL Server or another centralized computer.
 # In some cases, ALL processes above and below will be executed in a single "Import Load" execution. Therefore, the encapsulation of all processes in a single module.
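
The comments above explain that one execution may run any subset of the processes, and the module reads integer settings such as `GEOIP2_PROCESS` with `int(getenv(...))`. Here is a minimal sketch of how such settings could be read defensively and turned into a list of processes to run. The helpers `env_int` and `enabled_processes` are hypothetical, and treating a nonzero value as "run this process" is an assumption, not something the diff states.

```python
from os import getenv


def env_int(name, default=0):
    """Return an integer setting, or default when the variable is unset or blank."""
    raw = getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        # Reuses the 'ERROR - ' prefix convention described in the module comments.
        raise ValueError(f"ERROR - {name} must be an integer, got {raw!r}") from None


def enabled_processes(flags):
    """List the names whose flag is nonzero (assumed meaning of the flag)."""
    return [name for name, value in flags.items() if value != 0]


# GEOIP2_PROCESS appears in logs2mysql.py; the dictionary key is illustrative.
flags = {"geoip2": env_int("GEOIP2_PROCESS")}
to_run = enabled_processes(flags)
```

A plain `int(getenv(name))` raises `TypeError` when the variable is missing from `.env`; a default avoids that and keeps the failure message searchable when the value is malformed.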