
Commit 77f8b53

version 2.1.3 - process improvements
see changelog
1 parent 117d8a5 commit 77f8b53

6 files changed: +59 -39 lines changed

.github/CHANGELOG.md

Lines changed: 6 additions & 1 deletion
@@ -5,6 +5,7 @@
 - version 2.1.0 - 12/09/2024 - request_log_id functionality
 - version 2.1.1 - 12/11/2024 - rename column timeStamp to logged - add 4 indexes and 10 views
 - version 2.1.2 - 12/20/2024 - several improvements
+- version 2.1.3 - 12/27/2024 - process improvements
 - [1.0.1] apache_logs.error_systemCodeID corrected line - INTO logsystemCode to INTO logsystemCodeID
 - [1.0.1] remove debugging - SELECT statement from apache_logs.process_access_import, process_error_import & normalize_useragent.
 - [1.0.1] remove whitespace and commented out old code on all stored programs
@@ -69,4 +70,8 @@
 - [2.1.2] modify `process_access_parse` and `process_error_parse` - WHERE CLAUSE for parameter `ALL` to select ONLY completed LOAD processes.
 - [2.1.2] reformatted SQL statements in all 66 schema views for code standardization in SQL files used to create `apacheLogs2MySQL.sql`
 - [2.1.2] modify all 11 `access_ua_` views SQL statements `FROM apache_logs.access_log_ua ln INNER JOIN apache_logs.access_log_useragent lua INNER JOIN apache_logs.access_log`
-- [2.1.2] created new `entity_relationship_diagram.png` to reflect database changes.
+- [2.1.2] created new `entity_relationship_diagram.png` to reflect database changes.
+- [2.1.3] modify `access_log` and `error_log` TABLES reordering COLUMNS to improve database design readability.
+- [2.1.3] modify `normalize_useragent` to removed first parameter restriction. Any string 8 characters are more can be passed.
+- [2.1.3] modify `apacheLogs2MySQL.py` add `completed` COLUMN to UPDATE statement to fix processing `process_access_import` and `process_error_import` with `ALL` parameter.
+- [2.1.3] modify `call_processes.sql` adding more comments to better describe options and parameters and overall processing.

.github/README.md

Lines changed: 1 addition & 1 deletion
@@ -225,7 +225,7 @@ Run polling module from PM2:
 ```
 pm2 start watch4logs.py
 ```
-Run MySQL Stored Procedures from Command Line Client or Workbench:
+Run MySQL Stored Procedures from Command Line Client or GUI Database Tool:
 
 Passing 'ALL' as second parameter processes ALL files & records based process_status.
 This can be multiple importloadid values. Passing an importloadid value as a STRING processes ONLY files & records related to that importloadid.

.github/call_processes.sql

Lines changed: 18 additions & 4 deletions
@@ -5,8 +5,11 @@ process_access_parse - Parsing into proper columns. Apache Access formats use %r
 process_access_import - normalization of parsed LOAD DATA table into 9 error tables & 3 comon log tables shared with error logs.
 normalize_useragent - normalized Python parsed userAgent TABLE into 11 tables
 Each Stored Procudure has 2 parameters.
-- IN in_processName VARCHAR(100) - This indicates the LogFormat to process.
+- IN in_processName VARCHAR(100) - This indicates the LogFormat to process.
+NOTE: For normalize_useragent parameter can be any string >= 8 characters
 - IN in_importLoadID VARCHAR(20) - This indicates to importloadid to process. Valid values 'ALL' or a value converted to INTEGER=importloadid
+NOTE: if in_importLoadID='ALL' ONLY importloadID records with import_load TABLE "completed" COLUMN NOT NULL will be processed. This is to avoid
+interfering with Python client modules uploading files at same time as server STORED PROCEDURES executing.
 
 import_file TABLE - record of every access & error log file processed by Python processFiles function. Each LOAD DATA table has importfileid COLUMN.
 import_load TABLE - record for every execution of Python process. Each record contains information on LogFormat log files processed.
@@ -19,8 +22,19 @@ by Python apacheLogs2MySQL.py processLogs function Or only LOAD DATA can be exec
 
 LOAD DATA stage tables - load_access_combined, load_access_csv2mysql, load_access_vhost, load_error_default have a process_status COLUMN.
 process_status=0 - LOAD DATA tables loaded with raw log data
-process_status=1 - process_error_parse and process_access_parse executed on record
-process_status=2 - process_error_import and process_access_import executed on record
+process_status=1 - process_error_parse or process_access_parse executed on record
+process_status=2 - process_error_import or process_access_import executed on record
+
+If importing multiple domains check before executing process_error_import or process_access_import STORED PROCEDURES.
+In order to filter and report data properly this SELECT statement should return NO RECORDS.
+- SELECT * FROM apache_logs.import_file WHERE server_name IS NULL;
+
+import_file TABLE UPDATES should execute after _parse STORED PROCEDURES if importing multiple formats with servername (%v). Parsing will fill-in columns.
+if not using environment variables to SET ServerName and ServerPort in Python and no formats contain %v Format String UPDATES can be executed before Parsing.
+import_file TABLE UPDATES must be executed before _import STORED PROCEDURES. Below are examples of possible UPDATE STATEMENTS based on imported log file name.
+- UPDATE apache_logs.import_file SET server_name='farmfreshsoftware.com', server_port=443 WHERE server_name IS NULL AND name LIKE '%farmfreshsoftware%';
+- UPDATE apache_logs.import_file SET server_name='farmwork.app', server_port=443 WHERE server_name IS NULL AND name LIKE '%farmwork%';
+- UPDATE apache_logs.import_file SET server_name='ip255-255-255-255.us-east.com', server_port=443 WHERE server_name IS NULL AND name LIKE '%error%';
 
 Commands below execute each Stored Procudure and process ALL importloadid based on process_status. Each VPS creates importloadid each time executed.
 */
@@ -38,4 +52,4 @@ CALL process_access_import('vhost','ALL');
 CALL process_access_parse('csv2mysql','ALL');
 CALL process_access_import('csv2mysql','ALL');
 
-CALL normalize_useragent('MySQL Workbench','ALL');
+CALL normalize_useragent('Any Process Name','ALL');
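
The call order shown in this file (parse, then import, per access LogFormat, followed by user-agent normalization) could be sketched in Python roughly as follows. This is only an illustration: `build_calls` and its parameters are hypothetical names, not part of the repository, and only the `'vhost'` and `'csv2mysql'` format names visible in the hunk above are used. The values are interpolated directly into the SQL text, so the helper assumes trusted input.

```python
def build_calls(formats, import_load_id="ALL", ua_process_name="Any Process Name"):
    """Return CALL statements in the order the comment block describes.

    Hypothetical helper: it only builds strings and never touches a database.
    """
    calls = []
    for fmt in formats:
        # _parse must run before _import for the same LogFormat.
        calls.append(f"CALL process_access_parse('{fmt}','{import_load_id}');")
        calls.append(f"CALL process_access_import('{fmt}','{import_load_id}');")
    # User-agent normalization runs last, after access records are imported.
    calls.append(f"CALL normalize_useragent('{ua_process_name}','{import_load_id}');")
    return calls


if __name__ == "__main__":
    for statement in build_calls(["vhost", "csv2mysql"]):
        print(statement)
```

Per the comments in this file, any `import_file` UPDATE statements for `server_name` would need to run between the `_parse` and `_import` calls when formats contain `%v`; the sketch leaves that step out.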

apacheLogs2MySQL.py

Lines changed: 2 additions & 1 deletion
@@ -10,7 +10,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-# version 2.1.2 - 12/20/2024 - several improvements - see changelog
+# version 2.1.3 - 12/27/2024 - process improvements - see changelog
 #
 # Copyright 2024 Will Raymond <[email protected]>
 #
@@ -651,6 +651,7 @@ def processLogs():
 ', csv2mysqlLogProcessed=' + str(csv2mysqlFileProcessed) +
 ', userAgentProcessed=' + str(useragentFileProcessed) +
 ', errorOccurred=' + str(processError) +
+', completed=now()' +
 ', processSeconds=' + str(processSeconds) + ' WHERE id=' + str(importLoadID) +';')
 importLoadCursor = conn.cursor()
 try:
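
The hunk above builds the UPDATE by string concatenation. As a review note, the same statement could be expressed with DB-API placeholders. The sketch below is not the author's code: `build_import_load_update` and its argument names are hypothetical, the table name `import_load` is assumed from the comments in `call_processes.sql`, and only the columns visible in this hunk are included (the real statement sets more columns before these).

```python
def build_import_load_update(import_load_id, csv2mysql_processed,
                             useragent_processed, process_error, process_seconds):
    """Build a parameterized UPDATE that also stamps the `completed` column.

    Hypothetical helper; returns (sql, params) for a DB-API cursor that uses
    the `format` paramstyle (%s placeholders).
    """
    sql = (
        "UPDATE import_load SET "          # table name is an assumption
        "csv2mysqlLogProcessed=%s, "
        "userAgentProcessed=%s, "
        "errorOccurred=%s, "
        "completed=now(), "                # the column added in version 2.1.3
        "processSeconds=%s "
        "WHERE id=%s;"
    )
    params = (csv2mysql_processed, useragent_processed,
              process_error, process_seconds, import_load_id)
    return sql, params


if __name__ == "__main__":
    statement, values = build_import_load_update(12, 3, 1, 0, 4.25)
    print(statement)
    print(values)
```

With a driver such as PyMySQL this would be executed as `cursor.execute(sql, params)`, which leaves value quoting to the driver.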

apacheLogs2MySQL.sql

Lines changed: 31 additions & 31 deletions
@@ -10,7 +10,7 @@
 -- # See the License for the specific language governing permissions and
 -- # limitations under the License.
 -- #
--- # version 2.1.2 - 12/20/2024 - several improvements - see changelog
+-- # version 2.1.3 - 12/27/2024 - process improvements - see changelog
 -- #
 -- # Copyright 2024 Will Raymond <[email protected]>
 -- #
@@ -111,8 +111,13 @@ DROP TABLE IF EXISTS `access_log`;
 /*!50503 SET character_set_client = utf8mb4 */;
 CREATE TABLE `access_log` (
 `id` int NOT NULL AUTO_INCREMENT,
-`importfileid` int NOT NULL,
-`logged` datetime DEFAULT NULL,
+`logged` datetime NOT NULL,
+`servernameid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_servername',
+`serverportid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_serverport',
+`clientnameid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_clientname',
+`clientportid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_clientport',
+`refererid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_referer',
+`requestlogid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_requestlogid',
 `bytes_received` int NOT NULL,
 `bytes_sent` int NOT NULL,
 `bytes_transferred` int NOT NULL,
@@ -127,14 +132,9 @@ CREATE TABLE `access_log` (
 `reqqueryid` int DEFAULT NULL,
 `remoteuserid` int DEFAULT NULL,
 `remotelognameid` int DEFAULT NULL,
-`refererid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_referer',
-`clientnameid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_clientname',
-`clientportid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_clientport',
-`servernameid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_servername',
-`serverportid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_serverport',
-`requestlogid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_requestlogid',
 `cookieid` int DEFAULT NULL,
 `useragentid` int DEFAULT NULL,
+`importfileid` int NOT NULL,
 `added` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
 KEY `F_access_reqstatus` (`reqstatusid`),
@@ -1676,8 +1676,13 @@ DROP TABLE IF EXISTS `error_log`;
 /*!50503 SET character_set_client = utf8mb4 */;
 CREATE TABLE `error_log` (
 `id` int NOT NULL AUTO_INCREMENT,
-`importfileid` int NOT NULL,
 `logged` datetime NOT NULL,
+`servernameid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_servername',
+`serverportid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_serverport',
+`clientnameid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_clientname',
+`clientportid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_clientport',
+`refererid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_referer',
+`requestlogid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_requestlogid',
 `loglevelid` int DEFAULT NULL,
 `moduleid` int DEFAULT NULL,
 `processid` int DEFAULT NULL,
@@ -1687,12 +1692,7 @@ CREATE TABLE `error_log` (
 `systemcodeid` int DEFAULT NULL,
 `systemmessageid` int DEFAULT NULL,
 `logmessageid` int DEFAULT NULL,
-`refererid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_referer',
-`clientnameid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_clientname',
-`clientportid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_clientport',
-`servernameid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_servername',
-`serverportid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_serverport',
-`requestlogid` int DEFAULT NULL COMMENT 'Access & Error shared normalization table - log_requestlogid',
+`importfileid` int NOT NULL,
 `added` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
 KEY `F_error_level` (`loglevelid`),
@@ -2197,12 +2197,12 @@ CREATE TABLE `import_client` (
 `deviceid` varchar(200) NOT NULL,
 `login` varchar(200) NOT NULL,
 `expandUser` varchar(200) NOT NULL,
-`platformSystem` varchar(100) DEFAULT NULL,
-`platformNode` varchar(100) DEFAULT NULL,
-`platformRelease` varchar(100) DEFAULT NULL,
-`platformVersion` varchar(150) DEFAULT NULL,
-`platformMachine` varchar(100) DEFAULT NULL,
-`platformProcessor` varchar(150) DEFAULT NULL,
+`platformSystem` varchar(100) NOT NULL,
+`platformNode` varchar(100) NOT NULL,
+`platformRelease` varchar(100) NOT NULL,
+`platformVersion` varchar(150) NOT NULL,
+`platformMachine` varchar(100) NOT NULL,
+`platformProcessor` varchar(150) NOT NULL,
 `added` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='Table keeps track of all application Windows, Linux and Mac clients loading logs to server application and long with logon and IP address information. It is important to know who is loading logs.';
@@ -2314,7 +2314,7 @@ CREATE TABLE `import_format` (
 
 LOCK TABLES `import_format` WRITE;
 /*!40000 ALTER TABLE `import_format` DISABLE KEYS */;
-INSERT INTO `import_format` VALUES (1,'common',NULL,'2024-12-20 16:31:09'),(2,'combined',NULL,'2024-12-20 16:31:09'),(3,'vhost',NULL,'2024-12-20 16:31:09'),(4,'csc2mysql',NULL,'2024-12-20 16:31:09'),(5,'error_default',NULL,'2024-12-20 16:31:09'),(6,'error_vhost',NULL,'2024-12-20 16:31:09');
+INSERT INTO `import_format` VALUES (1,'common',NULL,'2024-12-27 12:55:32'),(2,'combined',NULL,'2024-12-27 12:55:32'),(3,'vhost',NULL,'2024-12-27 12:55:32'),(4,'csc2mysql',NULL,'2024-12-27 12:55:32'),(5,'error_default',NULL,'2024-12-27 12:55:32'),(6,'error_vhost',NULL,'2024-12-27 12:55:32');
 /*!40000 ALTER TABLE `import_format` ENABLE KEYS */;
 UNLOCK TABLES;
 
@@ -2410,12 +2410,12 @@ DROP TABLE IF EXISTS `import_server`;
 /*!50503 SET character_set_client = utf8mb4 */;
 CREATE TABLE `import_server` (
 `id` int NOT NULL AUTO_INCREMENT,
-`dbuser` varchar(255) DEFAULT NULL,
-`dbhost` varchar(255) DEFAULT NULL,
-`dbversion` varchar(255) DEFAULT NULL,
-`dbsystem` varchar(255) DEFAULT NULL,
-`dbmachine` varchar(255) DEFAULT NULL,
-`serveruuid` varchar(255) DEFAULT NULL,
+`dbuser` varchar(255) NOT NULL,
+`dbhost` varchar(255) NOT NULL,
+`dbversion` varchar(255) NOT NULL,
+`dbsystem` varchar(255) NOT NULL,
+`dbmachine` varchar(255) NOT NULL,
+`serveruuid` varchar(255) NOT NULL,
 `added` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='Table for keeping track of log processing servers and login information.';
@@ -4381,8 +4381,8 @@ BEGIN
 IF CONVERT(in_importLoadID, UNSIGNED) = 0 AND in_importLoadID != 'ALL' THEN
 SIGNAL SQLSTATE '22003' SET MESSAGE_TEXT = 'Invalid parameter value for in_importLoadID. Must be convert to number or be ALL';
 END IF;
-IF FIND_IN_SET(in_processName, "default,MySQL WorkBench,Python Processed") = 0 THEN
-SIGNAL SQLSTATE '22003' SET MESSAGE_TEXT = 'Invalid parameter value for in_processName. Must be default,MySQL WorkBench OR Python Processed';
+IF LENGTH(in_processName) < 8 THEN
+SIGNAL SQLSTATE '22003' SET MESSAGE_TEXT = 'Invalid parameter value for in_processName. Must be minimum of 8 characters';
 END IF;
 IF NOT CONVERT(in_importLoadID, UNSIGNED) = 0 THEN
 SET importLoad_ID = CONVERT(in_importLoadID, UNSIGNED);
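
For readers following the new parameter check, the validation in this hunk can be approximated in Python as below. This is a sketch under assumptions, not a port of the stored procedure: `validate_params` is a hypothetical name, the leading-digit regex only roughly imitates how MySQL converts a string to UNSIGNED, and the `'ALL'` comparison here is case-sensitive, whereas MySQL's comparison depends on collation. The length test counts bytes because MySQL `LENGTH()` returns a byte count, which differs from the character count for multi-byte text.

```python
import re


def validate_params(process_name, import_load_id):
    """Approximate the normalize_useragent parameter checks.

    Returns the numeric importloadid, or None when 'ALL' was passed.
    Raises ValueError where the procedure would SIGNAL SQLSTATE '22003'.
    """
    # Rough stand-in for CONVERT(in_importLoadID, UNSIGNED): leading digits, else 0.
    match = re.match(r"\d+", import_load_id)
    numeric = int(match.group()) if match else 0

    if numeric == 0 and import_load_id != "ALL":
        raise ValueError("Invalid parameter value for in_importLoadID. "
                         "Must be convert to number or be ALL")

    # LENGTH() in MySQL counts bytes, so encode before measuring.
    if len(process_name.encode("utf-8")) < 8:
        raise ValueError("Invalid parameter value for in_processName. "
                         "Must be minimum of 8 characters")

    return numeric if numeric != 0 else None


if __name__ == "__main__":
    print(validate_params("Any Process Name", "ALL"))
    print(validate_params("Any Process Name", "42"))
```

One consequence of replacing `FIND_IN_SET` with a length test is that the old value `'default'` (7 characters) would now be rejected, while the other two previously allowed values still pass.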

watch4logs.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-# version 2.1.2 - 12/20/2024 - several improvements - see changelog
+# version 2.1.3 - 12/27/2024 - process improvements - see changelog
 #
 # Copyright 2024 Will Raymond <[email protected]>
 #
