Commit c0e5403

rename column - add 4 indexes and 10 views
See CHANGELOG.md for details
1 parent 02d2496 commit c0e5403

File tree

5 files changed

+401
-27
lines changed

.github/CHANGELOG.md

Lines changed: 8 additions & 1 deletion
@@ -3,6 +3,7 @@
 - version 1.1.1 - 11/20/2024 - keyword replacement
 - version 2.0.0 - 11/30/2024 - backward incompatible
 - version 2.1.0 - 12/09/2024 - request_log_id functionality
+- version 2.1.1 - 12/11/2024 - rename column timeStamp to logged - add 4 indexes and 10 views
 - [1.0.1] apache_logs.error_systemCodeID corrected line - INTO logsystemCode to INTO logsystemCodeID
 - [1.0.1] remove debugging - SELECT statement from apache_logs.process_access_import, process_error_import & normalize_useragent.
 - [1.0.1] remove whitespace and commented out old code on all stored programs
@@ -49,4 +50,10 @@
 - [2.1.0] modify process_access_import - add servernameid to requestlogid before normalization to avoid duplicate requestlogid on consolidation of multiple domain logs.
 - [2.1.0] modify process_error_import - add servernameid to requestlogid before normalization to avoid duplicate requestlogid on consolidation of multiple domain logs.
 - [2.1.0] modify apacheLogs2MySQL.py - add .replace('"', ' in.') to all useragent attributes before UPDATE statement execute. The " in attribute value breaks UPDATE String.
-- [2.1.0] update README.md to describe and explain additional request_log_id functionality
+- [2.1.0] update README.md to describe and explain additional request_log_id functionality
+- [2.1.1] rename COLUMN `timeStamp` to `logged` in TABLES `access_log` and `error_log`.
+- [2.1.1] add access_log INDEXES `I_access_log_logged` and `I_access_log_servernameid_logged`.
+- [2.1.1] add error_log INDEXES `I_error_log_logged` and `I_error_log_servernameid_logged`.
+- [2.1.1] modify `process_access_import` and `process_error_import` for COLUMN rename `timeStamp` to `logged` in TABLES `access_log` and `error_log`.
+- [2.1.1] add access_log views `access_period_year_list`, `access_period_month_list`, `access_period_week_list`, `access_period_day_list`, `access_period_hour_list`.
+- [2.1.1] add error_log views `error_period_year_list`, `error_period_month_list`, `error_period_week_list`, `error_period_day_list`, `error_period_hour_list`.
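
The 2.1.1 entries above (column rename, two indexes per table, period views) can be sketched end to end. This is only an illustration under loud assumptions: in-memory SQLite stands in for MySQL, the `access_log` table is cut down to three hypothetical columns, the composite index column order is guessed from its name, and the body of `access_period_year_list` is an assumed shape, not the view shipped in this commit.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; the table is a hypothetical reduction of access_log.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE access_log (
    id INTEGER PRIMARY KEY,
    servernameid INTEGER,
    timeStamp TEXT
);
INSERT INTO access_log (servernameid, timeStamp) VALUES
    (1, '2023-12-31 23:59:59'),
    (1, '2024-12-09 08:15:00'),
    (2, '2024-12-11 17:30:00');

-- 2.1.1: rename the column, then index it alone and together with servernameid.
ALTER TABLE access_log RENAME COLUMN timeStamp TO logged;
CREATE INDEX I_access_log_logged ON access_log (logged);
CREATE INDEX I_access_log_servernameid_logged ON access_log (servernameid, logged);

-- Assumed shape of a period view: one row per year with a row count.
CREATE VIEW access_period_year_list AS
    SELECT strftime('%Y', logged) AS period_year, COUNT(*) AS log_count
    FROM access_log
    GROUP BY period_year
    ORDER BY period_year;
""")

# Column names after the rename, and the rows the assumed view returns.
columns = [row[1] for row in conn.execute("PRAGMA table_info(access_log)")]
rows = conn.execute("SELECT period_year, log_count FROM access_period_year_list").fetchall()
print(columns)
print(rows)
```

Any stored program that still refers to `timeStamp` would fail after the rename, which is presumably why `process_access_import` and `process_error_import` are modified in the same release.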

.github/README.md

Lines changed: 14 additions & 17 deletions
@@ -7,30 +7,30 @@ Imports Access Logs in Apache LogFormats - ***common***, ***combined*** and ***v
 
 Imports Error Logs in Apache ***default*** ErrorLogFormat & ***additional*** ErrorLogFormat defined below performing data harmonization on Apache Codes & Messages, System Codes & Messages, and Log Messages to create a unified, standardized dataset. Error Log view images below.
 
-Application has two options to associate ServerName & ServerPort with Access and Error logs missing `%v - canonical ServerName` and `%p - canonical ServerPort` Format Strings described below.
+Two options to associate ServerName & ServerPort with Access and Error logs missing `%v - canonical ServerName` and `%p - canonical ServerPort` Format Strings described below.
 
 4 LogFormats & 2 ErrorLogFormats can be loaded and 5 MySQL Stored Procedures can be processed in a single Python `ProcessLogs function` execution.
 
-Database system is designed to accommodate unlimited domains. Easy MySQL database installation with 3 simple steps.
-## MySQL Access Log View by Browser - 1 of 56 schema views
+Database system designed to accommodate unlimited domains. Easy MySQL database install with 3 simple steps.
+## MySQL Access Log View by Browser - 1 of 66 schema views
 MySQL View - apache_logs.access_ua_browser_family_list - data from LogFormat: combined & csv2mysql
 ![view-access_ua_browser_family_list.png](./assets/access_ua_browser_list.png)
 ## Application Description
-This is a fast, reliable processing application with detailed logging and two-staged data parsing. First stage is performed in LOAD DATA statements and second stage is performed in _parse Stored Procedures.
+This is a fast, reliable processing application with detailed logging and two stages of data parsing. First stage is performed in LOAD DATA statements and second stage is performed in _parse Stored Procedures.
 
-Data parsing can be customized in process_access_parse and process_error_parse MySQL Stored Procedures by adding or modifying SQL UPDATE statements if required.
+If required, data parsing can be customized in process_access_parse and process_error_parse MySQL Stored Procedures by adding or modifying SQL UPDATE statements.
 
-Python handles polling of log file folders and executing MySQL Database LOAD DATA statements, Stored Procedures & Functions and SQL Statements. Python drives the application but MySQL does all Data Manipulation & Processing.
+Python handles polling of log file folders and executing MySQL Database LOAD DATA statements, Stored Procedures, Stored Functions and SQL Statements. Python drives the application but MySQL does all Data Manipulation & Processing.
 
-There is no need to move log files. Log files can be left in the folder they were imported from for later referencing. Application knows what files have been processed. Application runs with no need for user interaction.
+There is no need to move log files. Log files can be left in the folder they were imported from for later referencing. Application records what files have been processed in `apache_logs.import_file` TABLE. Application runs with no need for user interaction.
 
-Log-level variables can be set to display info messages in console and inserted into PM2 logs for every process step. All import errors in Python processLogs (client) and MySQL Stored Procedures (server) are inserted into apache_logs.import_error TABLE. This is the only schema table that uses ENGINE=MYISAM to avoid TRANSACTION ROLLBACKS.
+Log-level variables can be set to display info messages in console or insert into PM2 logs for every process step. All import errors in Python processLogs (client) and MySQL Stored Procedures (server) are inserted into `apache_logs.import_error` TABLE. This is the only schema table that uses ENGINE=MYISAM to avoid TRANSACTION ROLLBACKS.
 
-Logging functionality, database design and table relationship constraints produce both physical integrity and logical integrity. This enables a complete audit trail providing the ability to determine when, where and what file each record originated from.
+Logging functionality, database design and table relationship constraints produce both physical integrity and logical integrity. This enables a complete audit trail providing ability to determine when, where and what file each record originated from.
 
-All folder pathnames, filename patterns, logging, MySQL connection setting variables are in .env file for easy installation and maintenance.
+All folder paths, filename patterns, logging, processing, MySQL connection setting variables are in .env file for easy installation and maintenance.
 
-The Python modules can run in PM2 daemon process manager for 24/7 online processing. These modules can be run on multiple computers feeding a single server module.
+Two Python modules can run in PM2 daemon process manager for 24/7 online processing. Client modules can be run on multiple computers feeding a single Server module simultaneously.
 
 Application is developed with Python 3.12, MySQL and 4 Python modules. Modules are listed with Python Package Index link, install command for each platform & GitHub Repository link.
 ## Required Python Modules
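
The README text in this hunk says log files stay in place and that processed files are recorded in the `apache_logs.import_file` TABLE. A minimal sketch of that bookkeeping idea follows. It is not the application's code: in-memory SQLite stands in for MySQL, the single `name` column and the `new_files` helper are hypothetical, and the file paths are made up.

```python
import sqlite3

# SQLite stands in for MySQL; the one-column import_file layout is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE import_file (name TEXT PRIMARY KEY)")


def new_files(candidates):
    """Record each candidate path and return only those not seen before."""
    fresh = []
    for name in candidates:
        # INSERT OR IGNORE leaves the table untouched when the name already exists.
        cur = conn.execute("INSERT OR IGNORE INTO import_file (name) VALUES (?)", (name,))
        if cur.rowcount == 1:
            fresh.append(name)
    conn.commit()
    return fresh


# Two polling passes over the same folder: the second sees one old and one new file.
first_pass = new_files(["logs/access.log.1", "logs/access.log.2"])
second_pass = new_files(["logs/access.log.2", "logs/access.log.3"])
print(first_pass)
print(second_pass)
```

Keeping the record in the database rather than moving files is what lets a later pass skip `logs/access.log.2` without touching the folder.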
@@ -42,9 +42,7 @@ Python module links & install command lines for each platform. Single quotes aro
 |[watchdog](https://pypi.org/project/watchdog/)|pip install watchdog|sudo apt-get install python3-watchdog|python3 -m pip install watchdog|[gorakhargosh/watchdog](https://github.com/gorakhargosh/watchdog/tree/master)|
 |[python-dotenv](https://pypi.org/project/python-dotenv/)|pip install python-dotenv|sudo apt-get install python3-dotenv|python3 -m pip install python-dotenv|[theskumar/python-dotenv](https://github.com/theskumar/python-dotenv)|
 ## Four Supported Access Log Formats
-Apache uses same Standard Access LogFormats (***common***, ***combined***, ***vhost_combined***) on all 3 platforms. Each LogFormat adds 2 Format Strings to
-the prior. Format String descriptions are listed below each LogFormat. Information from:
-https://httpd.apache.org/docs/2.4/mod/mod_log_config.html#logformat
+Apache uses same Standard Access LogFormats (***common***, ***combined***, ***vhost_combined***) on all 3 platforms. Each LogFormat adds 2 Format Strings to the prior. Format String descriptions are listed below each LogFormat. Information from: https://httpd.apache.org/docs/2.4/mod/mod_log_config.html#logformat
 ```
 LogFormat "%h %l %u %t \"%r\" %>s %O" common
 ```
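
To make the field layout of the ***common*** LogFormat above concrete, here is a small Python sketch that splits one line into its seven Format Strings. It is illustrative only: the application itself parses in MySQL (LOAD DATA plus the `_parse` Stored Procedures), and the `COMMON` pattern, the group names and the sample line are assumptions made for this example.

```python
import re

# One named group per Format String in: %h %l %u %t "%r" %>s %O
COMMON = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" '   # request line; tolerates backslash-escaped quotes
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+)'
)

# Hypothetical sample line in the common format.
line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'

match = COMMON.fullmatch(line)
fields = match.groupdict() if match else {}
print(fields)
```

The ***combined*** and ***vhost_combined*** formats add further Format Strings to this one, so a pattern for them would extend `COMMON` with extra groups rather than replace it.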
@@ -102,8 +100,7 @@ LogFormat "%v,%p,%h,%l,%u,%t,%I,%O,%S,%B,%{ms}T,%D,%^FB,%>s,\"%H\",\"%m\",\"%U\"
 |%{VARNAME}C|ADDED - The contents of cookie VARNAME in request sent to server. Only version 0 cookies are fully supported. Format String is optional.|
 |%L|ADDED - The request log ID from the error log (or '-' if nothing has been logged to the error log for this request). Look for the matching error log line to see what request caused what error.|
 ## Two supported Error Log Formats
-Application processes Error Logs with ***default format*** for threaded MPMs (Multi-Processing Modules). If you're running Apache 2.4 on any platform and ErrorLogFormat is not defined in config files this is the Error Log format. Information from:
-https://httpd.apache.org/docs/2.4/mod/core.html#errorlogformat
+Application processes Error Logs with ***default format*** for threaded MPMs (Multi-Processing Modules). If you're running Apache 2.4 on any platform and ErrorLogFormat is not defined in config files this is the Error Log format. Information from: https://httpd.apache.org/docs/2.4/mod/core.html#errorlogformat
 ```
 ErrorLogFormat "[%{u}t] [%-m:%l] [pid %P:tid %T] %7F: %E: [client\ %a] %M% ,\ referer\ %{Referer}i"
 ```
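
A line written with the ***default*** ErrorLogFormat above can be split into its bracketed fields as follows. This is a rough sketch, not the application's parser (which runs in MySQL): the `ERROR` pattern and group names are assumptions, the sample line is illustrative, and the optional `%7F:` and `%E:` parts are deliberately left inside the message rather than split out.

```python
import re

# Bracketed fields of: [%{u}t] [%-m:%l] [pid %P:tid %T] ... [client %a] %M
ERROR = re.compile(
    r'\[(?P<time>[^\]]+)\] '
    r'\[(?P<module>[^:\]]*):(?P<level>[^\]]+)\] '
    r'\[pid (?P<pid>\d+)(?::tid (?P<tid>\d+))?\] '
    r'(?:\[client (?P<client>[^\]]+)\] )?'   # absent for errors not tied to a request
    r'(?P<message>.*)'
)

# Illustrative sample line in the default threaded-MPM layout.
line = (
    "[Fri Sep 09 10:42:29.902022 2011] [core:error] "
    "[pid 35708:tid 4328636416] [client 72.15.99.187] "
    "File does not exist: /usr/local/apache2/htdocs/favicon.ico"
)

match = ERROR.match(line)
fields = match.groupdict() if match else {}
print(fields)
```

Making the `tid` and `client` groups optional keeps the same pattern usable for lines that lack a thread ID or a client address.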
@@ -249,7 +246,7 @@ The second parameter enables Python Client modules to run simultaneously on mult
 Database normalization is the process of organizing data in a relational database to improve data integrity and reduce redundancy.
 Normalization ensures that data is organized in a way that makes sense for the data model and attributes, and that the database functions efficiently.
 
-MySQL `apache_logs` Schema has 47 Tables, 779 Columns, 125 Indexes, 56 Views, 7 Stored Procedures and 42 Functions to process Apache Access log in 4 formats
+MySQL `apache_logs` Schema has 47 Tables, 855 Columns, 131 Indexes, 66 Views, 7 Stored Procedures and 42 Functions to process Apache Access log in 4 formats
 & Apache Error log in 2 formats. Database normalization at work!
 ## MySQL Access Log View by URI
 MySQL View - apache_logs.access_requri_list - data from LogFormat: combined & csv2mysql

apacheLogs2MySQL.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-# version 2.1.0 - 12/09/2024 - add request_log_id to error & access formats
+# version 2.1.1 - 12/11/2024 - rename column timeStamp to logged - add 4 indexes and 10 views
 #
 # Copyright 2024 Will Raymond <[email protected]>
 #
