From b25289422f35f4d293e780b24962a26f85f1fe07 Mon Sep 17 00:00:00 2001 From: "David M. Hollenstein" Date: Tue, 23 Apr 2024 16:35:14 +0200 Subject: [PATCH 1/9] docs: Correct version of the latest release to 0.1.0 --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 6c3de0d..edaa2b7 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,7 +2,7 @@ ---------------------------------------------------------------------------------------- -## Version [0.0.10] - Interface and template rewrite +## Version [0.1.0] - Interface and template rewrite Released: 2024-04-23 This release introduces a new public interface for reading YAML template files, applying the template to a table (which now creates compiled table sections), and for writing compiled table sections to an Excel worksheet. In addition, with the `ReportBuilder` class a convenient high-level interface is introduced for creating (multi-tab) Excel reports from tables and table templates. Moreover, the YAML table templates have been partially redesigned, breaking backward compatibility. From e24001ecccfb2241ba21235ef1495c0ce8dc5847 Mon Sep 17 00:00:00 2001 From: "David M. 
Hollenstein" Date: Thu, 19 Sep 2024 18:19:32 +0200 Subject: [PATCH 2/9] docs: Add MaxQuant combined proteins example file --- examples/proteinGroups.txt | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) create mode 100644 examples/proteinGroups.txt diff --git a/examples/proteinGroups.txt b/examples/proteinGroups.txt new file mode 100644 index 0000000..7489169 --- /dev/null +++ b/examples/proteinGroups.txt @@ -0,0 +1,23 @@ +Protein IDs Majority protein IDs Peptide counts (all) Peptide counts (razor+unique) Peptide counts (unique) Protein names Gene names Fasta headers Number of proteins Peptides Razor + unique peptides Unique peptides Peptides H900Y000_1 Peptides H900Y000_2 Peptides H900Y000_3 Peptides H900Y030_1 Peptides H900Y030_2 Peptides H900Y030_3 Peptides H900Y100_1 Peptides H900Y100_2 Peptides H900Y100_3 Razor + unique peptides H900Y000_1 Razor + unique peptides H900Y000_2 Razor + unique peptides H900Y000_3 Razor + unique peptides H900Y030_1 Razor + unique peptides H900Y030_2 Razor + unique peptides H900Y030_3 Razor + unique peptides H900Y100_1 Razor + unique peptides H900Y100_2 Razor + unique peptides H900Y100_3 Unique peptides H900Y000_1 Unique peptides H900Y000_2 Unique peptides H900Y000_3 Unique peptides H900Y030_1 Unique peptides H900Y030_2 Unique peptides H900Y030_3 Unique peptides H900Y100_1 Unique peptides H900Y100_2 Unique peptides H900Y100_3 Sequence coverage [%] Unique + razor sequence coverage [%] Unique sequence coverage [%] Mol. 
weight [kDa] Sequence length Sequence lengths Fraction average Fraction 45 Fraction 55 Fraction 65 Fraction 75 Q-value Score Identification type H900Y000_1 Identification type H900Y000_2 Identification type H900Y000_3 Identification type H900Y030_1 Identification type H900Y030_2 Identification type H900Y030_3 Identification type H900Y100_1 Identification type H900Y100_2 Identification type H900Y100_3 Sequence coverage H900Y000_1 [%] Sequence coverage H900Y000_2 [%] Sequence coverage H900Y000_3 [%] Sequence coverage H900Y030_1 [%] Sequence coverage H900Y030_2 [%] Sequence coverage H900Y030_3 [%] Sequence coverage H900Y100_1 [%] Sequence coverage H900Y100_2 [%] Sequence coverage H900Y100_3 [%] Intensity Intensity H900Y000_1 Intensity H900Y000_2 Intensity H900Y000_3 Intensity H900Y030_1 Intensity H900Y030_2 Intensity H900Y030_3 Intensity H900Y100_1 Intensity H900Y100_2 Intensity H900Y100_3 iBAQ peptides iBAQ iBAQ H900Y000_1 iBAQ H900Y000_2 iBAQ H900Y000_3 iBAQ H900Y030_1 iBAQ H900Y030_2 iBAQ H900Y030_3 iBAQ H900Y100_1 iBAQ H900Y100_2 iBAQ H900Y100_3 LFQ intensity H900Y000_1 LFQ intensity H900Y000_2 LFQ intensity H900Y000_3 LFQ intensity H900Y030_1 LFQ intensity H900Y030_2 LFQ intensity H900Y030_3 LFQ intensity H900Y100_1 LFQ intensity H900Y100_2 LFQ intensity H900Y100_3 MS/MS count H900Y000_1 MS/MS count H900Y000_2 MS/MS count H900Y000_3 MS/MS count H900Y030_1 MS/MS count H900Y030_2 MS/MS count H900Y030_3 MS/MS count H900Y100_1 MS/MS count H900Y100_2 MS/MS count H900Y100_3 MS/MS count Peptide sequences Only identified by site Reverse Potential contaminant id Peptide IDs Peptide is razor Mod. 
peptide IDs Evidence IDs MS/MS IDs Best MS/MS Oxidation (M) site IDs Oxidation (M) site positions Taxonomy IDs +A0A024RBG1 A0A024RBG1 5 1 1 NUDT4 sp|A0A024RBG1|NUD4B_HUMAN Diphosphoinositol polyphosphate phosphohydrolase NUDT4B OS=Homo sapiens OX=9606 GN=NUDT4B PE=3 SV=1 1 5 1 1 4 5 5 5 5 5 5 5 5 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 33.7 6.6 6.6 20.434 181 181 75 8 0.0026496 2.0059 By matching By matching By matching By MS/MS By matching By matching By matching By matching By matching 27.1 33.7 33.7 33.7 33.7 33.7 33.7 33.7 33.7 22826000 0 1739700 2157200 4107300 2197300 2219700 3689400 4489900 2225100 7 3260800 0 248530 308170 586750 313900 317100 527060 641420 317870 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 "AACLCFR;EVYEEAGVK;LLGIFEQNQDRK;PVHAEYLEK;YPDQWIVPGGGMEPEEEPGGAAVR" 0 "121;19733;42163;55725;82303" "False;False;True;False;False" "126;19915;42566;56336;83165" "1365;1366;1367;1368;1369;1370;1371;1372;1373;203111;203112;203113;203114;203115;203116;203117;203118;203119;438372;438373;438374;438375;438376;438377;438378;438379;577813;577814;577815;577816;577817;577818;577819;577820;577821;852117;852118;852119;852120;852121;852122;852123;852124;852125" "805;806;807;808;809;810;811;812;813;123960;123961;123962;123963;123964;123965;123966;123967;123968;268422;353380;353381;353382;353383;353384;353385;353386;353387;353388;518745;518746;518747;518748;518749;518750;518751;518752" "806;123965;268422;353383;518749" -1 +A0JNW5 A0JNW5 5 5 5 UHRF1-binding protein 1-like UHRF1BP1L sp|A0JNW5|UH1BL_HUMAN UHRF1-binding protein 1-like OS=Homo sapiens OX=9606 GN=UHRF1BP1L PE=1 SV=2 1 5 5 5 4 5 4 5 4 4 4 5 2 4 5 4 5 4 4 4 5 2 4 5 4 5 4 4 4 5 2 4.3 4.3 4.3 164.2 1464 1464 68.5 8 29 0 8.3292 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 3.2 4.3 3.7 4.3 3.6 3.4 3.6 4.3 1.6 126780000 12666000 16754000 18147000 17300000 12934000 12145000 13088000 15994000 7755500 82 48762 8234.4 5243.1 221300 6157.7 4057.5 4175.8 7632.4 6923.1 6337.8 20023000 16256000 
16714000 15132000 14563000 16808000 14992000 16338000 17315000 3 4 4 1 3 3 2 4 1 25 "AGDSCNHWMYFSDATK;HYLCNRPVGSDQK;SCDNFNLLHPIFQR;SECHQDQPR;SVTVNHMSDNR" 10 "2076;30471;62650;63099;67941" "True;True;True;True;True" "2102;30760;63308;63762;68656" "21338;21339;21340;21341;21342;21343;21344;315842;315843;315844;315845;315846;315847;315848;648686;648687;648688;648689;648690;648691;648692;648693;648694;653183;653184;653185;653186;653187;653188;653189;653190;702250;702251;702252;702253;702254;702255" "13106;13107;13108;13109;13110;13111;194009;194010;194011;194012;194013;393944;393945;393946;393947;393948;393949;393950;393951;396658;396659;426370;426371;426372;426373;426374" "13109;194010;393946;396658;426372" -1 +A0MZ66 A0MZ66 4 4 4 Shootin-1 KIAA1598 sp|A0MZ66|SHOT1_HUMAN Shootin-1 OS=Homo sapiens OX=9606 GN=SHTN1 PE=1 SV=4 1 4 4 4 3 3 3 4 4 2 3 2 3 3 3 3 4 4 2 3 2 3 3 3 3 4 4 2 3 2 3 8.6 8.6 8.6 71.639 631 631 66.6 4 15 13 0 14.305 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 6 5.2 7.1 8.6 8.6 3.8 7.1 3.8 7.1 262610000 13146000 33116000 23895000 56767000 54764000 11071000 32431000 15972000 21452000 37 3447600 275710 360350 248460 770550 798980 299220 194460 243710 256150 19675000 21634000 17513000 21679000 20962000 0 17577000 15636000 18280000 2 3 2 3 5 2 3 1 3 24 "EQAIGEYEDLRAENQK;HSVDELQK;LTQQLEEER;TLEAEFNSPSPPTPEPGEGPR" 11 "17768;30046;46114;70467" "True;True;True;True" "17935;30331;46544;71209" "182789;182790;182791;182792;182793;182794;182795;182796;182797;182798;182799;182800;311151;311152;311153;311154;311155;311156;311157;311158;311159;479086;479087;479088;479089;728575;728576;728577;728578;728579;728580;728581" "111687;111688;111689;111690;111691;111692;111693;111694;111695;190970;190971;190972;190973;190974;190975;190976;190977;293239;293240;293241;293242;442532;442533;442534" "111694;190971;293242;442533" -1 +A0PJW6 A0PJW6 1 1 1 Transmembrane protein 223 TMEM223 sp|A0PJW6|TM223_HUMAN Transmembrane protein 223 OS=Homo 
sapiens OX=9606 GN=TMEM223 PE=1 SV=1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 12.4 12.4 12.4 22.049 202 202 70 9 9 0 4.2797 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 12.4 12.4 12.4 12.4 12.4 12.4 12.4 12.4 12.4 303250000 34469000 37278000 35404000 32062000 33340000 31864000 37295000 29695000 31841000 12 25271000 2872400 3106500 2950300 2671800 2778400 2655300 3107900 2474500 2653400 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1 8 AGGQQVTLTTHAPFGLGAHFTVPLK 12 2183 TRUE 2210 "22413;22414;22415;22416;22417;22418;22419;22420;22421;22422;22423;22424;22425;22426;22427;22428;22429;22430" "13790;13791;13792;13793;13794;13795;13796;13797;13798;13799" 13794 -1 +A0PJZ3 A0PJZ3 4 4 4 Glucoside xylosyltransferase 2 GXYLT2 sp|A0PJZ3|GXLT2_HUMAN Glucoside xylosyltransferase 2 OS=Homo sapiens OX=9606 GN=GXYLT2 PE=2 SV=2 1 4 4 4 2 2 4 3 2 2 3 3 2 2 2 4 3 2 2 3 3 2 2 2 4 3 2 2 3 3 2 9.7 9.7 9.7 51.055 443 443 63.3 13 1 9 0 4.1119 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By matching By MS/MS By MS/MS 4.5 4.5 9.7 7.7 4.5 4.5 7.7 7.7 4.5 133460000 10289000 10925000 20504000 14655000 12701000 12227000 19300000 20621000 12238000 25 2352200 213810 246250 442510 173860 283870 272460 249130 209010 261260 10324000 11036000 8185200 7384800 12057000 10831000 9078600 10333000 11825000 2 1 3 1 1 1 0 2 1 12 "EAEHEGVSVLHGNR;HVIIHVGPNQMH;LEETLVMLK;LFLPVILK" 13 "12821;30295;39410;40112" "True;True;True;True" "12946;30582;39785;40489" "132144;132145;132146;132147;313803;313804;313805;313806;313807;313808;313809;313810;313811;409206;416464;416465;416466;416467;416468;416469;416470;416471;416472" "80499;192668;192669;192670;192671;192672;250446;254854;254855;254856;254857;254858" "80499;192671;250446;254855" -1 +A0PK00 A0PK00 1 1 1 Transmembrane protein 120B TMEM120B sp|A0PK00|T120B_HUMAN Transmembrane protein 120B OS=Homo sapiens OX=9606 GN=TMEM120B PE=1 SV=1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 0 1 1 1 0 1 1 1 1 0 1 1 1 0 1 5 5 5 40.245 339 339 
65 7 0 17.24 By matching By matching By matching By matching By MS/MS By matching By MS/MS 5 5 5 0 5 5 5 0 5 55466000 8348000 5464000 4793700 0 6396100 5843800 13561000 0 11060000 13 4266600 642160 420310 368750 0 492000 449520 1043100 0 850740 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 2 EWHELEGEFQELQETHR 14 19773 TRUE 19955 "203507;203508;203509;203510;203511;203512;203513" "124191;124192" 124191 -1 +A1A4S6 A1A4S6 5 5 5 Rho GTPase-activating protein 10 ARHGAP10 sp|A1A4S6|RHG10_HUMAN Rho GTPase-activating protein 10 OS=Homo sapiens OX=9606 GN=ARHGAP10 PE=1 SV=1 1 5 5 5 4 4 4 4 5 5 4 4 4 4 4 4 4 5 5 4 4 4 4 4 4 4 5 5 4 4 4 7 7 7 89.374 786 786 66.1 9 7 22 0 10.506 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 5.9 5.9 5.9 5.9 7 7 6.1 5.9 6.1 311370000 35194000 27764000 36187000 45821000 50063000 51991000 9214900 42781000 12355000 49 4982400 573530 412600 584220 803770 887470 857040 48720 724280 90742 34446000 34541000 35899000 32182000 34011000 44655000 34079000 35868000 37923000 3 2 2 2 4 4 3 3 2 25 "CIDASLR;HLTNVSNHSK;MGFTIIR;NHLLADGGSFGDWASTIPGQTR;QWLEALGGK" 15 "7150;29298;48101;50759;59218" "True;True;True;True;True" "7228;29575;48573;51314;59847" "73773;73774;73775;73776;73777;73778;73779;302729;302730;302731;302732;302733;302734;302735;302736;302737;499592;499593;499594;499595;499596;499597;499598;499599;499600;526895;526896;526897;526898;526899;526900;526901;526902;526903;611864;611865;611866;611867" "45395;185900;185901;185902;185903;185904;185905;185906;185907;185908;305764;305765;322290;322291;322292;322293;322294;322295;322296;322297;322298;322299;373182;373183;373184" "45395;185907;305764;322294;373183" -1 +A1L020 A1L020 4 3 3 RNA-binding protein MEX3A MEX3A sp|A1L020|MEX3A_HUMAN RNA-binding protein MEX3A OS=Homo sapiens OX=9606 GN=MEX3A PE=1 SV=1 1 4 3 3 3 3 3 3 3 3 1 2 2 2 2 2 2 2 2 0 1 1 2 2 2 2 2 2 0 1 1 12.3 10.6 10.6 54.173 520 520 63.6 8 6 0 7.91 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By matching By 
matching By matching 9 9.4 9 9 9 9 1.7 6.2 6.2 106960000 14420000 9315400 18209000 18683000 9650900 17848000 0 6875500 11956000 22 4861700 655440 423430 827700 849230 438680 811260 0 312520 543460 12902000 0 14342000 13091000 0 14915000 0 0 0 1 1 1 2 1 2 0 0 0 8 "GSSNTTECVPVPTSEHVAEIVGR;TDPECPVCHITATQAIR;TPVRGEEPVFMVTGR;VVGLVVGPK" 16 "27062;68912;71627;79145" "True;True;True;False" "27317;69636;72375;79970" "278710;278711;278712;278713;278714;278715;278716;278717;712208;740332;740333;740334;740335;740336;820251;820252;820253;820254;820255;820256;820257;820258;820259" "170710;170711;432465;449780;449781;449782;449783;449784;499133" "170711;432465;449784;499133" -1 +A1L0T0 A1L0T0 10 10 10 Acetolactate synthase-like protein ILVBL sp|A1L0T0|HACL2_HUMAN 2-hydroxyacyl-CoA lyase 2 OS=Homo sapiens OX=9606 GN=ILVBL PE=1 SV=2 1 10 10 10 10 9 9 10 9 10 10 9 8 10 9 9 10 9 10 10 9 8 10 9 9 10 9 10 10 9 8 25.2 25.2 25.2 67.867 632 632 59.1 26 31 36 18 0 64.17 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 25.2 21.4 22.5 25.2 21.4 25.2 25.2 21.8 18 2347500000 257740000 324060000 286610000 328710000 240390000 270650000 223030000 236200000 180130000 25 16204000 1594600 1959000 1506200 2198300 1685100 2685100 1572000 1513600 1490000 76208000 82122000 81164000 82485000 82685000 78506000 64994000 78788000 78623000 7 11 9 11 7 9 7 7 9 77 "AAVETLGVPCFLGGMAR;DGHPVVVNILIGR;ENEDQVVK;EQVPSLGSNVACGLAYTDYHK;GALQAVDQLSLFRPLCK;HGGENVAAVLR;LSGTVGVAAVTAGPGLTNTVTAVK;LVEGLQGQTWAPDWVEELR;RPLMVLGSQALLTPTSADK;VLHDAQQQCR" 17 "681;9041;17159;18150;23508;28605;45236;46388;61251;76310" "True;True;True;True;True;True;True;True;True;True" "691;9131;17323;18319;23722;28875;45659;46820;61898;77106" 
"7052;7053;7054;7055;7056;7057;7058;7059;7060;92948;92949;92950;92951;92952;92953;92954;92955;92956;92957;92958;92959;92960;92961;92962;92963;92964;92965;176653;176654;176655;176656;176657;176658;176659;176660;176661;186636;186637;186638;186639;186640;186641;186642;241829;241830;241831;241832;241833;241834;241835;241836;241837;295058;295059;295060;295061;295062;295063;295064;295065;295066;469892;469893;469894;469895;469896;469897;481934;481935;481936;481937;481938;481939;481940;481941;481942;481943;481944;481945;481946;481947;481948;481949;634238;634239;634240;634241;634242;634243;634244;634245;634246;790062;790063;790064;790065;790066;790067;790068;790069;790070;790071;790072;790073;790074;790075;790076;790077;790078;790079;790080" "4291;4292;4293;4294;4295;4296;56882;56883;56884;56885;56886;56887;56888;56889;56890;108031;108032;108033;108034;108035;108036;108037;108038;108039;113965;113966;113967;113968;113969;113970;113971;113972;113973;113974;113975;113976;113977;148062;148063;148064;148065;148066;148067;180923;180924;180925;180926;180927;180928;180929;180930;180931;287523;295002;295003;295004;295005;295006;295007;295008;295009;295010;385803;385804;385805;385806;385807;385808;385809;385810;385811;480483;480484;480485;480486;480487;480488;480489;480490;480491;480492" "4294;56888;108036;113969;148066;180925;287523;295010;385805;480483" -1 +A1L188 A1L188 1 1 1 Uncharacterized protein C17orf89 C17orf89 sp|A1L188|NDUF8_HUMAN NADH dehydrogenase [ubiquinone] 1 alpha subcomplex assembly factor 8 OS=Homo sapiens OX=9606 GN=NDUFAF8 PE=1 SV=1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 14.9 14.9 14.9 7.7558 74 74 71 6 9 0.0034722 1.9414 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By matching By MS/MS 14.9 14.9 14.9 14.9 14.9 14.9 14.9 14.9 14.9 48926000 9214700 9331200 3640700 4500300 4666600 2089700 4381800 4857400 6243000 4 12231000 2303700 2332800 910180 1125100 1166700 522440 1095500 1214400 1560800 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 
0 1 8 CVQASTAPGGR 18 7636 TRUE 7717 "78536;78537;78538;78539;78540;78541;78542;78543;78544;78545;78546;78547;78548;78549;78550" "48206;48207;48208;48209;48210;48211;48212;48213" 48213 -1 +Q9NUL7 Q9NUL7 3 3 3 Probable ATP-dependent RNA helicase DDX28 DDX28 sp|Q9NUL7|DDX28_HUMAN Probable ATP-dependent RNA helicase DDX28 OS=Homo sapiens OX=9606 GN=DDX28 PE=1 SV=2 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 5.2 5.2 5.2 59.58 540 540 65.7 6 26 9 0 4.8331 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 5.2 5.2 5.2 5.2 5.2 5.2 5.2 5.2 5.2 295950000 32522000 34661000 31436000 34660000 36233000 36381000 31762000 32837000 25458000 32 9248400 1016300 1083100 982390 1083100 1132300 1136900 992560 1026200 795550 16013000 18108000 17750000 16423000 17800000 15774000 15550000 17692000 14499000 2 2 3 2 3 3 1 3 1 20 "AQQEAPAVR;HVVCAAETGSGK;IPVALQR" 7753 "4988;30372;34097" "True;True;True" "5049;30660;34433" "51545;51546;51547;51548;51549;51550;51551;51552;51553;51554;51555;51556;51557;51558;51559;51560;51561;51562;314683;314684;314685;314686;314687;314688;314689;314690;314691;314692;314693;314694;314695;314696;353708;353709;353710;353711;353712;353713;353714;353715;353716" "31919;31920;31921;31922;31923;31924;31925;31926;31927;193245;193246;193247;193248;193249;193250;193251;193252;193253;217424;217425;217426;217427" "31927;193250;217426" -1 +Q9NUM3 Q9NUM3 1 1 1 Zinc transporter ZIP9 SLC39A9 sp|Q9NUM3|S39A9_HUMAN Zinc transporter ZIP9 OS=Homo sapiens OX=9606 GN=SLC39A9 PE=1 SV=2 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 4.9 4.9 4.9 32.251 307 307 75 8 0 6.5451 By MS/MS By MS/MS By matching By MS/MS By matching By MS/MS By MS/MS By MS/MS 4.9 4.9 4.9 4.9 4.9 0 4.9 4.9 4.9 126740000 15013000 15850000 17381000 18180000 17044000 0 11496000 17505000 14269000 5 25348000 3002600 3170000 3476200 3636000 3408900 0 2299200 3500900 2853800 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 1 1 1 6 HHQASETHNVIASDK 7754 28813 TRUE 29084 
"297236;297237;297238;297239;297240;297241;297242;297243" "182276;182277;182278;182279;182280;182281" 182281 -1 +Q9NUM4 Q9NUM4 5 5 5 Transmembrane protein 106B TMEM106B sp|Q9NUM4|T106B_HUMAN Transmembrane protein 106B OS=Homo sapiens OX=9606 GN=TMEM106B PE=1 SV=2 1 5 5 5 5 5 5 3 5 4 5 4 4 5 5 5 3 5 4 5 4 4 5 5 5 3 5 4 5 4 4 20.4 20.4 20.4 31.127 274 274 56 25 9 15 0 18.493 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 20.4 20.4 20.4 12.8 20.4 16.8 20.4 16.4 16.4 1225200000 135170000 103510000 131050000 114740000 112720000 157570000 172240000 142630000 155610000 13 80997000 8662100 6737900 8357200 8271700 7496800 9510600 10828000 10050000 11081000 38761000 35058000 43345000 43518000 43372000 58054000 49874000 46818000 51573000 5 3 3 3 5 4 4 4 4 35 "EDAYDGVTSENMR;NGLVNSEVHNEDGR;SAYVSYDVQK;SLSHLPLHSSK;YQYVDCGR" 7755 "13497;50649;62631;65350;82551" "True;True;True;True;True" "13626;51203;63289;66042;83415" "139224;139225;139226;139227;139228;139229;139230;139231;139232;525794;525795;525796;525797;525798;525799;525800;525801;525802;648507;648508;648509;648510;648511;648512;648513;676309;676310;676311;676312;676313;676314;854833;854834;854835;854836;854837;854838;854839;854840;854841;854842;854843;854844;854845;854846;854847;854848;854849;854850" "84838;84839;84840;84841;84842;84843;84844;84845;84846;321602;321603;321604;321605;321606;321607;321608;321609;321610;393849;393850;393851;393852;393853;393854;410607;410608;410609;410610;520587;520588;520589;520590;520591;520592;520593;520594;520595" "84841;321608;393851;410607;520593" -1 +Q9NUN5 Q9NUN5 2 2 2 Probable lysosomal cobalamin transporter LMBRD1 sp|Q9NUN5|LMBD1_HUMAN Lysosomal cobalamin transport escort protein LMBD1 OS=Homo sapiens OX=9606 GN=LMBRD1 PE=1 SV=1 1 2 2 2 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 6.1 6.1 6.1 61.388 540 540 56.8 10 1 0 17.987 By MS/MS By MS/MS By MS/MS By matching By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 6.1 3.3 3.3 3.3 3.3 3.3 3.3 3.3 
3.3 302540000 47671000 20351000 42345000 46679000 34898000 32325000 28997000 19563000 29716000 14 21610000 3405100 1453700 3024700 3334200 2492700 2308900 2071200 1397400 2122500 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 7 "LENTEDIEEVEQHIQTIK;RCDADAPEDQCTVTR" 7756 "39640;59556" "True;True" "40016;60185" "411637;411638;411639;411640;411641;411642;411643;411644;411645;411646;615321" "251898;251899;251900;251901;251902;251903;251904;375104" "251901;375104" -1 +Q9NUP1 Q9NUP1 2 2 2 Biogenesis of lysosome-related organelles complex 1 subunit 4 BLOC1S4 sp|Q9NUP1|BL1S4_HUMAN Biogenesis of lysosome-related organelles complex 1 subunit 4 OS=Homo sapiens OX=9606 GN=BLOC1S4 PE=1 SV=1 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 1 2 2 2 2 14.3 14.3 14.3 23.351 217 217 65 8 7 8 0 6.5249 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 14.3 14.3 14.3 14.3 8.3 14.3 14.3 14.3 14.3 392710000 55092000 52696000 49108000 34933000 47435000 60596000 21868000 46401000 24576000 9 43634000 6121300 5855100 5456400 3881500 5270600 6732900 2429800 5155700 2730700 38076000 36755000 29268000 39388000 0 47527000 24623000 33733000 23858000 1 2 1 2 1 2 1 2 1 13 "GDSSHVVSEGVPR;SAPSRPQQAGYEAPVLFR" 7757 "23930;62471" "True;True" "24150;63128" "246251;246252;246253;246254;246255;246256;246257;246258;647061;647062;647063;647064;647065;647066;647067;647068;647069;647070;647071;647072;647073;647074;647075" "150773;150774;150775;150776;150777;150778;150779;392996;392997;392998;392999;393000;393001" "150778;392997" -1 +Q9NUP7 Q9NUP7 7 7 7 tRNA:m(4)X modification enzyme TRM13 homolog TRMT13 sp|Q9NUP7|TRM13_HUMAN tRNA:m(4)X modification enzyme TRM13 homolog OS=Homo sapiens OX=9606 GN=TRMT13 PE=1 SV=2 1 7 7 7 5 5 6 5 5 5 5 5 4 5 5 6 5 5 5 5 5 4 5 5 6 5 5 5 5 5 4 19.1 19.1 19.1 54.246 481 481 62.4 7 22 17 19 0 23.318 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 15.2 15.2 16.8 15.2 15.2 15.2 15.2 15.2 12.9 383750000 45982000 39973000 40154000 
33885000 55625000 34623000 44992000 46276000 42241000 29 2046500 233670 253790 914710 236570 415510 187930 241240 477750 864730 13007000 12734000 15714000 12742000 14290000 13049000 13346000 13363000 14953000 5 3 5 4 5 5 5 2 3 37 "CFVEFGAGK;DHIMSHPALHDALNDPK;FCGEHAGAAEEEDAR;HTVYEDQLAK;LQIDIQHLCLNK;RQDNQNDDSEEHDDGGYR;TSLETSNSTTK" 7758 "7045;9308;20281;30198;44416;61437;72123" "True;True;True;True;True;True;True" "7123;9405;20468;30484;44835;62087;72875" "72747;95823;95824;95825;95826;95827;95828;95829;95830;95831;95832;95833;95834;95835;95836;95837;95838;208728;208729;208730;208731;208732;208733;208734;208735;208736;312786;461841;461842;461843;461844;461845;461846;461847;461848;461849;636280;636281;636282;636283;636284;636285;636286;636287;636288;636289;636290;636291;636292;636293;636294;636295;636296;636297;745474;745475;745476;745477;745478;745479;745480;745481;745482;745483;745484" "44767;58533;58534;58535;58536;58537;127441;127442;127443;127444;127445;127446;127447;127448;191969;282662;282663;282664;282665;282666;282667;387019;387020;387021;387022;387023;387024;387025;387026;387027;452852;452853;452854;452855;452856;452857;452858" "44767;58534;127442;191969;282665;387024;452852" -1 +"Q9NUP9;O14910;Q9HAP6" Q9NUP9 "7;3;1" "7;3;1" "7;3;1" Protein lin-7 homolog C LIN7C sp|Q9NUP9|LIN7C_HUMAN Protein lin-7 homolog C OS=Homo sapiens OX=9606 GN=LIN7C PE=1 SV=1 3 7 7 7 7 6 7 6 7 7 7 7 7 7 6 7 6 7 7 7 7 7 7 6 7 6 7 7 7 7 7 43.7 43.7 43.7 21.834 197 "197;233;207" 52.8 43 17 8 9 0 22.131 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 43.7 39.1 43.7 34 43.7 43.7 43.7 43.7 43.7 3096400000 333120000 256900000 379510000 402430000 385260000 365180000 355900000 382300000 235780000 11 260060000 28794000 20686000 32202000 34292000 32425000 30394000 29712000 32666000 18887000 73384000 95491000 97413000 99045000 96197000 96231000 85582000 96080000 79879000 6 5 4 6 3 5 3 5 4 41 
"ATVAAFAASEGHSHPR;EQNSPIYISR;GDQLLSVNGVSVEGEHHEK;IIPGGIADR;TEEGLGFNIMGGK;VLEEMESR;VLQSEFCNAVR" 7759 "5959;18035;23914;32627;69097;76126;76590" "True;True;True;True;True;True;True" "6030;18203;24134;32935;69822;76922;77389" "61801;61802;61803;61804;61805;61806;61807;61808;61809;185437;185438;185439;185440;185441;185442;185443;185444;185445;185446;185447;185448;185449;185450;185451;185452;185453;246084;246085;246086;246087;246088;246089;246090;246091;246092;338370;338371;338372;338373;338374;338375;338376;338377;714092;714093;714094;714095;714096;714097;714098;714099;714100;788102;788103;788104;788105;788106;788107;788108;788109;788110;788111;788112;788113;788114;788115;788116;788117;793043;793044;793045;793046;793047;793048;793049;793050;793051" "38043;38044;38045;113250;113251;113252;113253;113254;113255;113256;113257;150675;207967;207968;207969;207970;433628;433629;433630;433631;433632;433633;433634;433635;433636;433637;479238;479239;479240;479241;479242;479243;482412;482413;482414;482415;482416;482417;482418;482419;482420" "38045;113256;150675;207968;433637;479242;482412" "-1;-1;-1" +Q9NUQ2 Q9NUQ2 6 6 6 1-acyl-sn-glycerol-3-phosphate acyltransferase epsilon AGPAT5 sp|Q9NUQ2|PLCE_HUMAN 1-acyl-sn-glycerol-3-phosphate acyltransferase epsilon OS=Homo sapiens OX=9606 GN=AGPAT5 PE=1 SV=3 1 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 20.1 20.1 20.1 42.072 364 364 53.9 25 18 18 0 20.905 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 20.1 20.1 20.1 20.1 20.1 20.1 20.1 20.1 20.1 868450000 115300000 96225000 108150000 86153000 81322000 107620000 98758000 84412000 90499000 19 24333000 3444900 3241900 2698300 2431500 2330200 2502700 3226500 1409800 3047300 25458000 25518000 24604000 22609000 22435000 25251000 24366000 23720000 22905000 6 5 5 4 6 6 5 4 5 46 "ATHVAFDCMK;DVPEEQEHMR;ESPTMTEFLCK;LLSAFLPAR;LQSYVDAGTPMYLVIFPEGTR;VLSASQAFAAQR" 7760 "5792;12289;18657;42672;44711;76608" "True;True;True;True;True;True" 
"5862;12411;18831;43077;45133;77407" "60101;60102;60103;60104;60105;60106;60107;60108;60109;60110;60111;60112;60113;126628;126629;126630;126631;126632;126633;126634;126635;126636;192113;192114;192115;192116;192117;192118;192119;192120;192121;443603;443604;443605;443606;443607;443608;443609;443610;443611;443612;443613;443614;464767;464768;464769;464770;464771;464772;464773;464774;464775;793236;793237;793238;793239;793240;793241;793242;793243;793244" "37047;37048;37049;37050;77091;77092;77093;77094;77095;77096;77097;77098;77099;117249;117250;117251;117252;117253;117254;117255;117256;117257;271597;271598;271599;271600;271601;271602;271603;271604;284475;284476;284477;284478;284479;284480;284481;284482;482560;482561;482562;482563;482564;482565;482566;482567;482568" "37050;77099;117257;271599;284477;482566" -1 +Q9NUQ3 Q9NUQ3 10 10 10 Gamma-taxilin TXLNG sp|Q9NUQ3|TXLNG_HUMAN Gamma-taxilin OS=Homo sapiens OX=9606 GN=TXLNG PE=1 SV=2 1 10 10 10 7 8 7 6 8 8 8 10 8 7 8 7 6 8 8 8 10 8 7 8 7 6 8 8 8 10 8 23.9 23.9 23.9 60.585 528 528 51.7 52 24 20 0 53.65 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 15.7 20.6 15.7 15.7 17.4 20.5 20.6 23.9 17.4 7794800000 253180000 325880000 358660000 315720000 310180000 400070000 2073700000 2317400000 1440000000 25 255860000 4779200 6187800 9266100 6506700 6258000 8677300 77479000 83228000 53482000 553400000 556190000 599870000 662930000 676450000 697340000 477020000 696740000 492220000 7 8 6 6 7 7 9 9 8 67 "ADMLCNSQSNDILQHQGSNCGGTSNK;DLATPVMQPCTALDSHK;EENMQQAR;GGGAEEATEAGR;HSLEEDEGSDFITENR;LQQTTQLIK;LRQENIELGEK;SNELFTTFR;VHLQSEHSK;YADLLEESR" 7761 "1063;9838;14410;24609;29937;44646;44929;65670;75447;80640" "True;True;True;True;True;True;True;True;True;True" "1076;9937;14542;24843;30222;45067;45352;66372;76234;81483" 
"10933;10934;10935;10936;10937;101430;101431;101432;101433;101434;101435;101436;101437;101438;101439;101440;101441;101442;101443;148508;148509;253623;253624;253625;253626;253627;253628;253629;253630;253631;253632;253633;253634;253635;253636;253637;253638;253639;253640;253641;253642;253643;253644;309985;309986;309987;309988;309989;309990;309991;309992;309993;309994;309995;309996;309997;309998;309999;310000;464151;464152;464153;464154;464155;464156;464157;464158;464159;466876;466877;466878;466879;466880;466881;466882;466883;466884;679666;679667;679668;679669;679670;679671;679672;679673;679674;780733;780734;780735;835185;835186;835187;835188;835189;835190;835191" "6756;6757;6758;6759;6760;6761;61861;61862;61863;61864;61865;61866;61867;61868;61869;61870;61871;90481;90482;155445;155446;155447;155448;155449;155450;155451;155452;155453;155454;155455;190268;190269;190270;190271;190272;190273;190274;190275;190276;284100;284101;284102;284103;284104;284105;284106;284107;284108;285722;285723;285724;285725;285726;285727;285728;285729;285730;412500;412501;412502;412503;412504;412505;412506;474807;508436;508437;508438;508439;508440;508441;508442" "6758;61867;90481;155452;190275;284101;285723;412500;474807;508438" -1 +Q9NUQ7 Q9NUQ7 8 8 8 Ufm1-specific protease 2 UFSP2 sp|Q9NUQ7|UFSP2_HUMAN Ufm1-specific protease 2 OS=Homo sapiens OX=9606 GN=UFSP2 PE=1 SV=3 1 8 8 8 8 8 8 8 8 8 8 7 7 8 8 8 8 8 8 8 7 7 8 8 8 8 8 8 8 7 7 22.6 22.6 22.6 53.261 469 469 64 27 43 18 0 18.944 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 22.6 22.6 22.6 22.6 22.6 22.6 22.6 20.7 19.2 1287200000 144710000 149640000 156980000 157870000 155370000 145940000 129930000 111190000 135530000 23 15329000 2024900 1784600 1557000 1630200 1739300 2064800 1706700 833770 1987800 52783000 50585000 46710000 51590000 49638000 50340000 46612000 46647000 49041000 8 7 6 6 7 7 6 6 7 60 
"EIQQALVDAGDKPATFVGSR;ELHDLFNLPHDRPYFK;GTSIVVPEPLHFLLPGK;HVLSDLSTK;LLVDAIHNQLTDMEK;LSSNALVFR;NLVTISYPSGIPDGQLQAYR;PATFVGSR" 7762 "15907;16444;27332;30318;42857;45570;51528;53591" "True;True;True;True;True;True;True;True" "16054;16598;27591;30606;43262;45994;52091;54180" "163853;163854;163855;163856;163857;163858;163859;163860;163861;163862;163863;163864;163865;163866;163867;163868;163869;169386;169387;169388;169389;169390;169391;169392;169393;281575;281576;281577;281578;281579;281580;281581;281582;281583;281584;281585;281586;281587;281588;281589;281590;281591;314076;314077;314078;314079;314080;314081;314082;314083;314084;445523;445524;445525;445526;445527;445528;445529;445530;445531;473252;473253;473254;473255;473256;473257;473258;473259;473260;473261;534562;534563;534564;534565;534566;534567;534568;534569;534570;555811;555812;555813;555814;555815;555816;555817;555818;555819" "99963;99964;99965;99966;99967;99968;99969;99970;103526;103527;103528;172351;172352;172353;172354;172355;172356;172357;172358;172359;192842;192843;192844;192845;192846;192847;192848;192849;192850;272713;272714;272715;272716;272717;272718;272719;272720;289642;289643;289644;289645;289646;289647;289648;326983;326984;326985;326986;326987;326988;326989;339950;339951;339952;339953;339954;339955;339956;339957;339958" "99970;103527;172355;192849;272719;289643;326987;339956" -1 +Q9NUQ8 Q9NUQ8 15 15 15 ATP-binding cassette sub-family F member 3 ABCF3 sp|Q9NUQ8|ABCF3_HUMAN ATP-binding cassette sub-family F member 3 OS=Homo sapiens OX=9606 GN=ABCF3 PE=1 SV=2 1 15 15 15 13 13 14 11 11 13 11 9 10 13 13 14 11 11 13 11 9 10 13 13 14 11 11 13 11 9 10 23.8 23.8 23.8 79.744 709 709 57.1 51 29 27 26 0 53.076 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 22.6 22.6 22.6 20.2 20.7 22.4 21.2 17.3 20.2 3713000000 460090000 385240000 380550000 498280000 471560000 442020000 358720000 358370000 358160000 33 79468000 9986400 8877700 8275600 10374000 10399000 9405600 7203000 7639400 
7307000 170190000 142210000 139410000 162100000 138790000 131360000 119720000 142230000 136600000 10 10 13 9 9 9 9 6 8 83 "ALFARPDLLLLDEPTNMLDVR;EQSSTVNAK;ERELTAQIAAGR;EYEAQQQYR;IENFDVSFGDR;ITENYDCGTK;LEEIEADK;LLLGDLAPVR;LSVSADLESR;MQQQPTR;REQSSTVNAK;VEGGFDQYR;VPAHISLLHVEQEVAGDDTPALQSVLESDSVR;YGISGELAMR;YGISGELAMRPLASLSGGQK" 7763 "3401;18106;18235;19835;31568;35057;39355;42345;45684;49046;59924;74463;77255;81400;81401" "True;True;True;True;True;True;True;True;True;True;True;True;True;True;True" "3443;18275;18404;20017;31864;35398;39730;42748;46108;49568;60559;75238;78068;82252;82253" "35138;35139;35140;35141;35142;35143;35144;35145;35146;186179;186180;186181;186182;186183;187496;187497;187498;187499;187500;187501;187502;187503;187504;187505;187506;187507;187508;187509;204136;327219;327220;327221;327222;327223;327224;327225;327226;327227;363708;363709;363710;363711;363712;363713;363714;363715;363716;363717;363718;363719;363720;363721;363722;363723;408627;408628;408629;408630;408631;408632;408633;408634;408635;408636;408637;408638;408639;408640;408641;408642;440144;440145;440146;440147;440148;440149;440150;440151;440152;440153;440154;440155;440156;440157;440158;440159;474481;474482;474483;474484;509499;509500;509501;509502;509503;509504;619647;619648;619649;619650;619651;619652;619653;619654;770121;770122;770123;770124;770125;770126;770127;770128;770129;800195;800196;800197;800198;800199;800200;800201;800202;800203;842879;842880;842881;842882;842883;842884;842885;842886;842887;842888;842889" 
"21807;21808;21809;21810;21811;21812;21813;21814;113700;113701;113702;113703;114473;114474;114475;114476;114477;114478;124570;200948;200949;200950;200951;200952;200953;200954;200955;223370;223371;223372;223373;223374;223375;223376;223377;223378;250125;250126;250127;250128;250129;250130;250131;269543;269544;269545;269546;269547;269548;269549;290393;290394;290395;290396;311744;311745;311746;311747;311748;311749;377505;377506;377507;377508;377509;377510;377511;468016;468017;468018;468019;468020;468021;468022;468023;468024;486950;486951;486952;513127;513128;513129;513130;513131;513132;513133" "21807;113700;114475;124570;200951;223372;250125;269548;290393;311749;377510;468024;486951;513127;513129" -1 +Q9NUQ9 Q9NUQ9 16 16 15 Protein FAM49B FAM49B sp|Q9NUQ9|CYRIB_HUMAN CYFIP-related Rac1 interactor B OS=Homo sapiens OX=9606 GN=CYRIB PE=1 SV=1 1 16 16 15 16 15 16 16 14 15 16 16 14 16 15 16 16 14 15 16 16 14 15 14 15 15 13 14 15 15 13 65.1 65.1 62.3 36.748 324 324 57.5 73 54 68 31 0 115.93 By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS By MS/MS 65.1 63 65.1 65.1 59.3 63 65.1 65.1 60.8 33749000000 3501000000 4005100000 3711200000 3686700000 3837400000 3857400000 3990100000 3558600000 3601300000 19 932870000 101670000 101150000 99892000 100930000 100600000 95314000 114600000 110460000 108270000 697130000 709170000 681070000 685360000 699780000 669940000 731560000 641780000 675980000 16 18 16 13 12 16 14 16 15 136 "AWGAVVPLVGK;DAEGILEDLQSYR;DQPPNSVEGLLNALR;EAIQHPADEK;EIYNQVNVVLK;FTNEETVSFCLR;FYEFSQR;GAGHEIR;GLLGALTSTPYSPTQHLER;HLNDETTSK;INNVPAEGENEVNNELANR;MTNPAIQNDFSYYR;TLSDATTK;VLTCTDLEQGPNFFLDFENAQPTESEK;VMLETPEYR;VMVGVIILYDHVHPVGAFAK" 7764 "6635;7764;11186;12986;16020;22817;23198;23426;25622;29198;33719;49336;70816;76696;76905;76959" "True;True;True;True;True;True;True;True;True;True;True;True;True;True;True;True" "6709;7845;11301;13111;16170;23024;23409;23638;25864;29475;34051;49866;71559;77495;77706;77767" 
"68811;68812;68813;68814;68815;68816;68817;68818;68819;68820;68821;68822;68823;68824;68825;68826;68827;68828;79888;79889;79890;79891;79892;79893;79894;79895;79896;79897;79898;79899;79900;79901;79902;79903;79904;115444;115445;115446;115447;115448;115449;115450;115451;115452;115453;115454;115455;115456;115457;115458;115459;115460;133893;133894;133895;133896;133897;133898;133899;133900;133901;133902;133903;133904;165140;165141;165142;165143;165144;165145;165146;165147;165148;234684;234685;234686;234687;234688;234689;234690;234691;238667;238668;238669;238670;238671;238672;238673;238674;240984;240985;240986;240987;240988;264300;264301;264302;264303;264304;264305;264306;264307;264308;264309;264310;264311;264312;264313;264314;264315;264316;301580;301581;301582;301583;301584;301585;301586;301587;301588;301589;301590;301591;301592;301593;301594;301595;301596;301597;301598;301599;301600;301601;349867;349868;349869;349870;349871;349872;349873;349874;349875;349876;349877;349878;349879;349880;349881;349882;349883;349884;512342;512343;512344;512345;512346;512347;512348;512349;512350;512351;512352;512353;512354;512355;512356;512357;732130;732131;732132;732133;732134;732135;732136;732137;732138;794188;794189;794190;794191;794192;794193;794194;794195;794196;794197;794198;794199;794200;794201;794202;794203;796374;796375;796376;796377;796378;796379;796380;796381;796382;796383;796384;796385;796386;796387;796388;796389;796390;796391;797061;797062;797063;797064;797065;797066;797067;797068;797069;797070;797071;797072;797073;797074;797075;797076" 
"42289;42290;42291;42292;42293;42294;42295;42296;42297;42298;42299;49015;49016;49017;49018;49019;49020;49021;70412;70413;70414;70415;70416;70417;70418;70419;70420;81619;81620;81621;81622;81623;81624;81625;81626;81627;81628;81629;81630;100825;100826;100827;100828;100829;100830;100831;100832;100833;143596;143597;143598;143599;143600;143601;143602;143603;146064;146065;146066;146067;146068;146069;146070;146071;147522;147523;161938;161939;161940;161941;161942;161943;161944;161945;161946;161947;161948;185083;185084;185085;185086;185087;185088;185089;185090;185091;185092;185093;185094;185095;185096;185097;185098;185099;185100;185101;185102;185103;215029;215030;215031;215032;215033;215034;215035;215036;215037;215038;215039;215040;313397;313398;313399;313400;313401;313402;313403;313404;313405;313406;444618;483145;483146;483147;483148;483149;483150;483151;483152;483153;484483;484484;484485;484486;484487;484488;484489;484944;484945;484946;484947;484948;484949;484950;484951;484952;484953" "42294;49021;70416;81622;100830;143600;146064;147522;161938;185093;215033;313403;444618;483146;484485;484953" -1 From b3dc7ed7f86df032aeb0b087ad13a6f389c9488b Mon Sep 17 00:00:00 2001 From: "David M. 
Hollenstein" Date: Thu, 19 Sep 2024 18:21:37 +0200 Subject: [PATCH 3/9] docs: Update readme file --- README.md | 97 +++++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 73 insertions(+), 24 deletions(-) diff --git a/README.md b/README.md index e772421..2563be9 100644 --- a/README.md +++ b/README.md @@ -1,55 +1,104 @@ -[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip) - # XlsxReport +[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) +![Python Version from PEP 621 TOML](https://img.shields.io/python/required-version-toml?tomlFilePath=https%3A%2F%2Fraw.githubusercontent.com%2Fhollenstein%2Fprofasta%2Fmain%2Fpyproject.toml) +[![pypi](https://img.shields.io/pypi/v/xlsxreport)](https://pypi.org/project/xlsxreport) + +**XlsxReport** is a Python library that automates the creation of formatted Excel reports from tabular data. + + +## Table of Contents -## Introduction +- [What is XlsxReport?](#what-is-xlsxreport) +- [Getting Started with a simple example](#getting-started-with-a-simple-example) +- [Installation](#installation) + - [Setting up the application data directory](#setting-up-the-application-data-directory) + - [Installation when using Anaconda](#installation-when-using-anaconda) +- [Upcoming features and work in progress](#upcoming-features-and-work-in-progress) -XlsxReport is a Python library that simplifies the creation of well-formatted Excel reports from CSV files of quantitative mass spectrometry (MS) results. It utilizes YAML template files to specify the arrangement and formatting of the CSV content in the resulting Excel file. 
-With XlsxReport, generating Excel reports for mass spectrometry results from the same software or pipeline is a breeze – just create a YAML report template file once and execute a command line script to create reproducibly formatted Excel reports whenever needed. +## What is XlsxReport? -The two main applications of XlsxReport are to create clean and uncluttered Excel files for the manual inspection of MS results, and to create Excel reports that can be used as supplementary tables for publications. +Well-formatted Excel reports are important for presenting and sharing data in a clear and structured manner with collaborators, in publications, and for the manual inspection of results. However, creating these reports manually is time-consuming, tedious, and has to be repeated for every new dataset and analysis. XlsxReport was developed to streamline the process of turning tabular data into formatted Excel reports. By automating this task, XlsxReport allows the creation of consistent, publication-ready Excel reports with minimal effort. -## Release +XlsxReport uses YAML template files to define the content, structure, and formatting of the generated Excel reports. The library provides a command-line interface and a Python API, allowing users to create Excel reports by applying table templates to tabular data. Although XlsxReport has been developed for quantitative mass spectrometry data, its versatile design makes it suitable for any type of tabular data. -XlsxReport is actively developed and currently in late alpha stage. +XlsxReport is actively developed as part of the computational toolbox for the [Mass Spectrometry Facility](https://www.maxperutzlabs.ac.at/research/facilities/mass-spectrometry-facility) at the Max Perutz Labs (University of Vienna). 
+## Getting started -## Install +With XlsxReport, generating reproducibly formatted Excel reports from your data analysis pipeline is a breeze - simply create a YAML table template once and execute a single terminal command to create Excel reports whenever needed. -If you do not already have a Python installation, we recommend installing the [Anaconda distribution](https://www.continuum.io/downloads) of Continuum Analytics, which already contains a large number of popular Python packages for Data Science. Alternatively, you can also get Python from the [Python homepage](https://www.python.org/downloads/windows). XlsxReport requires Python version 3.9 or higher. +Give it a try by using the provided example files in the `examples` directory. The `examples` directory contains a "proteinGroups.txt" file from MaxQuant, which can be turned into a formatted Excel report with the included default table template file "maxquant.yaml". -You can use pip to install XlsxReport from the distribution file with the following command: +After installing XlsxReport and setting up the application data directory as described below, you can create an Excel report by running the following command in the terminal: +```shell +xlsxreport compile examples/proteinGroups.txt maxquant.yaml ``` -pip install xlsxreport-X.Y.Z-py3-none-any.whl + +This command will create an Excel file named "proteinGroups.report.xlsx" in the same directory as the input file. The Excel file contains the data from the input file formatted according to the instructions in the table template. 
+ +You can achieve the same result using the Python API with the following code: + +```python +import pandas as pd +import xlsxreport + +template_path = xlsxreport.get_template_path("maxquant.yaml") +template = xlsxreport.TableTemplate.load(template_path) +table = pd.read_csv("./examples/proteinGroups.txt", sep="\t") +with xlsxreport.ReportBuilder("./examples/proteinGroups.report.xlsx") as builder: + builder.add_report_table(table, template, tab_name="Report") ``` -To uninstall the XlsxReport package type: +> _**NOTE:** The `xlsxreport compile` command and the `xlsxreport.get_template_path` Python function will initially verify if a valid file path for the table template is provided. If the table template file is not found, the application data directory will be searched. This feature allows you to store your default table templates in the application data directory and use them without specifying the full path._ + + +## Installation +If you do not already have a Python installation, we recommend installing the [Anaconda distribution](https://www.anaconda.com/download) or [Miniconda](https://docs.anaconda.com/free/miniconda/index.html) distribution from Continuum Analytics, which already contains a large number of popular Python packages for Data Science. Alternatively, you can also get Python from the [Python homepage](https://www.python.org/downloads/windows). Note that XlsxReport requires Python version 3.9 or higher. + +The following command will install the latest version of XlsxReport and its dependencies from PyPI, the Python Package Index: + +```shell +pip install xlsxreport ``` + +To uninstall the XlsxReport library, use: + +```shell pip uninstall xlsxreport ``` -### Installation when using Anaconda - -If you are using Anaconda, you will need to install the XlsxReport package into a conda environment.
Open the Anaconda navigator, activate the conda environment you want to use, run the "CMD.exe" application to open a terminal, and then use the pip install command as described above. +### Setting up the application data directory +After XlsxReport has been installed, you should create the local application data directory, which enables more convenient access to your default table templates. Running the following command creates a new XlsxReport folder in the local user application data directory, for example "C:/Users/user_name/AppData/Local/XlsxReport" on Windows 10, and copies the default table templates that are included with XlsxReport: -### Setting up the AppData directory +```shell +xlsxreport appdir --setup +``` -After XlsxReport has been installed the local AppData directory needs to be setup and the default template files need to be copied. Running the `xlsxreport setup` script creates a new XlsxReport folder in the local user app data directory, for example "C:/User/user_name/AppData/Local/XlsxReport" on Windows 10, and copies the default template files there. +To view the path to the application data directory, you can run the following command: +```shell +xlsxreport appdir ``` -xlsxreport setup + +Including the `--reveal` flag will open the application data directory in the file explorer: + +```shell +xlsxreport appdir --reveal ``` -## Run a script +### Installation when using Anaconda -To generate a simple excel protein report, run the `xlsxreport report` script with an input and template file. Here is an example with the default maxquant.yaml template file. +To install the XlsxReport package using Anaconda, you need to either activate a custom conda environment or install it into the default base environment. Open the Anaconda Navigator, activate the desired conda environment or use the base environment, and then open a terminal by running the "CMD.exe" application. Finally, use the `pip install` command as described above. 
-``` -xlsxreport report proteinGroups.txt maxquant.yaml -``` + +## Upcoming features and work in progress + +The library has reached a stable state and we are currently working on **extending the documentation** and adding **minor feature enhancements**. In addition, we are planning to also release a **simple GUI** for creating Excel reports that provides the same functionality as the command-line interface. + +If you have any feature requests, suggestions, or bug reports, please feel free to open an issue on the [GitHub issue tracker](https://github.com/hollenstein/xlsxreport/issues). \ No newline at end of file From 8cc7ca795463abf0e3c567373e946d0ef86f6b46 Mon Sep 17 00:00:00 2001 From: "David M. Hollenstein" Date: Fri, 20 Sep 2024 12:02:24 +0200 Subject: [PATCH 4/9] chore: Add GitHub actions workflow for running pytest --- .github/workflows/run-pytest.yml | 30 ++++++++++++++++++++++++++++++ README.md | 3 ++- 2 files changed, 32 insertions(+), 1 deletion(-) create mode 100644 .github/workflows/run-pytest.yml diff --git a/.github/workflows/run-pytest.yml b/.github/workflows/run-pytest.yml new file mode 100644 index 0000000..0703cdc --- /dev/null +++ b/.github/workflows/run-pytest.yml @@ -0,0 +1,30 @@ +# This workflow will install the xlsxreport package and its dependencies and run pytest with a variety of Python versions +# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python + +name: Run pytest +on: + push: + branches: ["develop", "feature/*", "main", "release/*"] + pull_request: + +jobs: + build: + runs-on: ubuntu-latest + strategy: + fail-fast: false + matrix: + python-version: ["3.9", "3.10", "3.11", "3.12"] + + steps: + - uses: actions/checkout@v4.1.2 + - name: Set up Python ${{ matrix.python-version }} + uses: actions/setup-python@v5.1.0 + with: + python-version: ${{ matrix.python-version }} + - name: Install dependencies + run: | + python -m pip install --upgrade pip + pip install .[tests] + - 
name: Test with pytest + run: | + python -m pytest diff --git a/README.md b/README.md index 2563be9..b8e2704 100644 --- a/README.md +++ b/README.md @@ -2,6 +2,7 @@ [![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) ![Python Version from PEP 621 TOML](https://img.shields.io/python/required-version-toml?tomlFilePath=https%3A%2F%2Fraw.githubusercontent.com%2Fhollenstein%2Fprofasta%2Fmain%2Fpyproject.toml) [![pypi](https://img.shields.io/pypi/v/xlsxreport)](https://pypi.org/project/xlsxreport) +[![Run pytest](https://github.com/hollenstein/xlsxreport/actions/workflows/run-pytest.yml/badge.svg?branch=main)](https://github.com/hollenstein/xlsxreport/actions/workflows/run-pytest.yml) **XlsxReport** is a Python library that automates the creation of formatted Excel reports from tabular data. @@ -24,7 +25,7 @@ XlsxReport uses YAML template files to define the content, structure, and format XlsxReport is actively developed as part of the computational toolbox for the [Mass Spectrometry Facility](https://www.maxperutzlabs.ac.at/research/facilities/mass-spectrometry-facility) at the Max Perutz Labs (University of Vienna). -## Getting started +## Getting Started with a simple example With XlsxReport, generating reproducibly formatted Excel reports from your data analysis pipeline is a breeze - simply create a YAML table template once and execute a single terminal command to create Excel reports whenever needed. 
From 64f09436e0bb2faa5636c9a6d6b315d83ef7e7ab Mon Sep 17 00:00:00 2001 From: David Hollenstein Date: Fri, 20 Sep 2024 13:58:49 +0200 Subject: [PATCH 5/9] docs: Add LICENSE file --- LICENSE | 201 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 201 insertions(+) create mode 100644 LICENSE diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..261eeb9 --- /dev/null +++ b/LICENSE @@ -0,0 +1,201 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. From 33b42c6e75170459b811b91f290fbbb3238a5548 Mon Sep 17 00:00:00 2001 From: "David M. Hollenstein" Date: Fri, 20 Sep 2024 17:03:16 +0200 Subject: [PATCH 6/9] docs: Update documentation --- DOCUMENTATION.md | 412 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 309 insertions(+), 103 deletions(-) diff --git a/DOCUMENTATION.md b/DOCUMENTATION.md index 85f9af0..9f26766 100644 --- a/DOCUMENTATION.md +++ b/DOCUMENTATION.md @@ -1,36 +1,59 @@ # XlsxReport -## Introduction +The documentation for XlsxReport is still work in progress. This file provides an overview of how XlsxReport works and a detailed description of the table template and its formatting options. -XlsxReport is a Python library that simplifies the creation of well-formatted Excel reports from CSV files of quantitative mass spectrometry (MS) results. It utilizes YAML template files to specify the arrangement and formatting of the CSV content in the resulting Excel file. 
+## Table of contents +- [How does **XlsxReport** work and what can it do?](#how-does-xlsxreport-work-and-what-can-it-do) + - [What **XlsxReport** can't do](#what-xlsxreport-cant-do) +- [How does the table template look like](#how-does-the-table-template-look-like) +- [The `sections` area of the table template](#the-sections-area-of-the-table-template) + - [1) The standard template section](#1-the-standard-template-section) + - [2) Tag template section](#2-tag-template-section) + - [3) Label tag template section](#3-label-tag-template-section) + - [4) Comparison template section](#4-comparison-template-section) + - [5) Common optional template section parameters](#5-common-optional-template-section-parameters) +- [The `formats` area of the table template](#the-formats-area-of-the-table-template) +- [The `conditional_formats` are of the table template](#the-conditional_formats-are-of-the-table-template) +- [The `settings` area of the table template](#the-settings-area-of-the-table-template) -With XlsxReport, generating Excel reports for mass spectrometry results from the same software or pipeline is a breeze – just create a YAML report template file once and execute a command line script to create reproducibly formatted Excel reports whenever needed. -The two main applications of XlsxReport are to create clean and uncluttered Excel files for the manual inspection of MS results, and to create Excel reports that can be used as supplementary tables for publications. +## How does XlsxReport work and what can it do? +To generate the formatted Excel report, XlsxReport requires tabular data, for example a CSV file, and a table template, a YAML file that contains instructions for the structure and formatting of the generated Excel report. The table template allows specifying which columns appear in the Excel file, the order of the columns, and which columns will be grouped together into `sections`. 
Furthermore, the format of headers can be specified, and individual formats and conditional formats can be applied to the content of each column. Moreover, it is possible to specify `section` supheaders that will be written above the header row into a merged cell. -## The XlsxReport report template document +> _**Note**: The term `section` or `template section` refers to a group of columns that are defined in the table template as a unit, and that are written to the Excel sheet as a block, or as a "section"._ -To generate the formatted Excel report, XlsxReport requires an input CSV file and a report template in YAML format. The report template is used to describe the structure and formatting of the generated Excel report. This allows specifying which columns should appear in the Excel file, the order of the columns, and which columns will be grouped together into sections. The report template file allows specifying the format of headers, and applying individual formats and conditional formats to the content of each column. Moreover, it is possible to specify section supheaders that will be written above the header row into a merged cell. -It is not possible to use the report template for renaming column headers, applying calculations to column values, and for sorting rows. In general, anything that changes the data is not the scope of XlsxReport, if such a functionality is required it should be implemented in another script that can be run before XlsxReport. +### What XlsxReport can't do +In general, changing values, i.e. the content of columns, is beyond the scope of XlsxReport. Specifically, it is not possible to use the table template for applying calculations to columns or for filtering and sorting rows. If such functionality is required it should be implemented in another script that can be run before XlsxReport. The only exception to this is the possibility to apply a log2 transformation on the values of a `tag section` or `label tag section`. 
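
Since filtering and sorting are left to a separate script, a preprocessing step along the following lines could run before XlsxReport. This is only a minimal sketch: the `preprocess` function and the `min_unique_peptides` threshold are hypothetical names chosen for illustration, the column names are taken from the header of the example "proteinGroups.txt" file, and a small in-memory table stands in for the real input. It uses only the Python standard library; pandas would work just as well.

```python
import csv
import io


def preprocess(rows, min_unique_peptides=2):
    """Filter and sort rows before the table is handed to XlsxReport.

    `rows` is a list of dicts as produced by csv.DictReader. The function
    name and the threshold are illustrative assumptions, not part of
    XlsxReport.
    """
    # Keep only entries with enough unique peptides.
    kept = [r for r in rows if int(r["Unique peptides"]) >= min_unique_peptides]
    # Sort by descending number of unique peptides.
    kept.sort(key=lambda r: int(r["Unique peptides"]), reverse=True)
    return kept


# Small in-memory stand-in for a tab-separated input such as proteinGroups.txt.
raw = "Protein IDs\tUnique peptides\nP1\t1\nP2\t15\nP3\t8\n"
rows = list(csv.DictReader(io.StringIO(raw), delimiter="\t"))
filtered = preprocess(rows)

# Write the result as tab-separated text; in practice this would go to a file
# that is then used as the input table for XlsxReport.
out = io.StringIO()
writer = csv.DictWriter(
    out,
    fieldnames=["Protein IDs", "Unique peptides"],
    delimiter="\t",
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(filtered)
result = out.getvalue()
```

Keeping such steps in their own script means the table template stays purely about layout and formatting, while every change to the data remains explicit and reproducible.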
-### How does the report template file look like +In addition, directly renaming column headers in the `standard section` is not yet supported, but is planned for a future release. -The report template file comprises four areas named `sections`, `formats`, `conditional_formats`, and `settings`. The `sections` area is used to select and organize columns, and to specify their formatting by assigning formats and conditional formats that are defined in the `formats` and `conditional_formats` areas. For example, a format determines decimal digits or alignment, whereas conditional formats define cell appearance based on values. The `settings` area is used to define general settings like row height, whether to apply an autofilter on the header row, or if a section supheader row should be added. -Here is a simple example of a report template that is used to generate an Excel file with tree columns: "Protein ID", "Gene name", and "Spectral count". It contains only one entry, "protein_evidence", in the `sections` area. In the "protein_evidence" section three columns are selected and a default format "str" is applied to all column values. In addition, the "int" format and the "count" conditional format are specifically applied to the values of the "Spectral count" column, overriding the defaults. Finally, a supheader "Protein evidence" is defined, which will be written to the excel above the header row. Writing supheader is enabled because the `settings` area contains the entry "write_supheader: True". In the `formats` area the two formats "int" and "str" are defined that have been referenced in the "protein_evidence" section. In addition, the format specified by the "header" and "supheader" entries are applied to the header and supheader row. The `conditional_formats` area contains one conditional format called "count", which has been assigned to the "Spectral count" column in the "protein_evidence" section. 
+## How does the table template look like + +The table template comprises four areas named `sections`, `formats`, `conditional_formats`, and `settings`. Each of these is encoded as a mapping in the YAML file: + +```YAML +sections: {} +formats: {} +conditional_formats: {} +settings: {} +``` + +The `sections` area is used to select and organize columns, and to specify their formatting by assigning formats and conditional formats that are defined in the `formats` and `conditional_formats` areas. For example, a format can specify the number of decimal digits that are displayed or the text alignment, whereas conditional formats define cell appearance based on values. Basically, formats and conditional formats are defined by using the same parameters as in Excel. Finally, the `settings` area of the table template is used to define general settings, such as row height, whether to apply an autofilter on the header row, or if an additional row for `section` supheaders should be added. + +Here is a basic example of a table template that generates an Excel file with three columns "Protein IDs", "Gene names", and "Unique peptides". 
```YAML sections: - protein_evidence: { - columns: ["Protein ID", "Gene name", "Spectral count"], - column_format: {"Spectral count": "int"}, - column_conditional: {"Spectral count": "count"}, - format: "str", - supheader: "Protein evidence", - } + protein_evidence: + columns: ["Protein IDs", "Gene names", "Unique peptides"] + column_format: {"Unique peptides": "int"} + column_conditional_format: {"Unique peptides": "count"} + format: "str" + supheader: "Protein evidence" formats: int: {"align": "center", "num_format": "0"} @@ -39,154 +62,337 @@ formats: supheader: {"bold": True, "align": "center", "bottom": 2} conditional_formats: - count: { - "type": "2_color_scale", - "min_type": "num", "min_value": 0, "min_color": "#ffffbf", - "max_type": "percentile", "max_value": 99.5, "max_color": "#f25540" - } + count: + "type": "2_color_scale" + "min_type": "num" + "min_value": 0 + "min_color": "#ffffbf" + "max_type": "percentile" + "max_value": 99.5 + "max_color": "#f25540" settings: write_supheader: True - add_autofilter: True - header_height: 95 ``` +This table template example contains only one `template section` with the internal name "protein_evidence". The `columns` keyword is used to select which columns will appear in the Excel report. Using the argument `format` the default format "str" is applied to all columns within the `template section`. In addition, the "int" format and the "count" conditional format are specifically applied to the values of the "Unique peptides" column, overriding the defaults. Finally, a supheader "Protein evidence" is defined, which will be written to the Excel sheet above the header row. Writing a supheader row must be specifically enabled, which is done in the `settings` area with the parameter "write_supheader: True". In the `formats` area the two formats "int" and "str" are defined that have been referenced in the "protein_evidence" `template section`. 
In addition, the formats specified with the names "header" and "supheader" are special formats that are always applied to the header and supheader rows. The `conditional_formats` area contains the description of the conditional format called "count", which has been assigned to the "Unique peptides" column. + -### The template section area - `sections` +## The `sections` area of the table template -Each entry in the `sections` area is defined by a unique name and contains a set of parameters that describe a group of columns that will be written to the excel file as a section. There are currently three different categories of template sections, each provides a different way how the columns for the section are selected. In addition, the parameters specified in a `template section` describe how the column values and headers will be formatted, if conditional formats are applied, and other settings. The order of `template sections` in the template file determines the order in which the sections, and thus the columns, are written to the excel file. +Each entry in the `sections` area is defined by a unique internal name and contains parameters that are used to select one or multiple columns, and parameters that define how the columns are formatted in the Excel report. There are currently four different types of template `sections`, each providing a different way of selecting the columns for the section. In addition, the parameters specified in a `template section` describe how the column values and headers will be formatted, whether conditional formats are applied, and other settings. The order of `template sections` in the table template determines the order in which the sections, and thus the columns, are written to the Excel file. -#### Default sections -In a `default section` columns are directly selected by specifying a list of column names with the `columns` parameter. The specified order of columns defines in which order the columns will be written to the Excel sheet.
Formats and conditional formats can be applied to the whole section or to individual columns. The parameters `tag` is not allowed in this section. The parameters `log2`, `replace_comparison_tag`, and `remove_tag` have no effect on this section type. -##### Additional section parameters -- Required: `columns: list[str]`
---- *Note: Description missing* --- +### 1) The standard template section -#### Tag sections -In a `tag section`, columns are not directly specified with a `columns` parameter but rather by specifying a `tag` that allows the selection of columns containing a specific string, but that also have a part of the column name different in each CSV file. The `tag` is used as a regular expression pattern to find matching colummns. This allows for example to create a section containing all "Intensity" columns, irrespective of how the samples are named. +When using a `standard template section`, you can directly select the columns that should be written to the Excel sheet by specifying a list of column names. This is useful when column names are constant in your output and don’t change between experiments, such as a "Protein IDs" column. Formats and conditional formats can be applied to the whole `section` or to individual columns. -##### Additional section parameters -- Required: `tag: str`
---- *Note: Description missing* --- +#### Required and optional section parameters -- Optional: `remove_tag: bool`
---- *Note: Description missing* --- +- `columns`
+A list of column names that should be written to the Excel sheet. The order of the columns in the sequence defines the order in which the columns will be written to the Excel sheet. Columns that are not present in the input table are ignored. + - Type: `sequence[string]` + - Is required -- Optional: `log2: bool`
---- *Note: Description missing* --- +For additional optional parameters refer to the [common optional template section parameters](#5-common-optional-template-section-parameters). -##### Global settings that specifically affect tag sample sections -- `log2_tag` -- `evaluate_log2_transformation` +### 2) Tag template section +You can use the `tag template section` to select a group of columns based on a regular expression pattern that is matched against all column names. This is useful when the column names are not constant between multiple tables, but contain a constant part and a variable part. For example, when you have columns named "Intensity Sample_1", "Intensity Sample_2", and so on, you can use a `tag template section` to select all columns that start with "Intensity". +> _**TIP:** Use the regular expression anchors `^` and `$` to specify whether the column name needs to begin or end with the tag. To exclude columns that are an exact match to the specified tag, add a `.` before or after the tag to indicate that at least one additional character must be present._ -##### How does this look like in practice?** +#### Required and optional section parameters -Let's assume we have the following template, containing a `tag sample section` with the name "intensities": + +- `tag`
+A regular expression pattern that is matched against all column names. Columns that match the pattern are selected for the section. Columns are written to the Excel sheet in the order they appear in the input table. + - Type: `string` + - Is required + +- `remove_tag`
+If True, the matched regular expression pattern of the `tag` is removed from the column name in the Excel sheet. Removing the tag is useful in combination with adding a `supheader` to the section, as in this way the removed tag can be displayed only in the supheader row. + - Type: `bool` + - Default: `False` + - Is optional + +- `log2`
+Use this parameter to apply a log2 transformation to the values of all selected columns in the `tag template section`. The global setting parameters `log2_tag` and `evaluate_log2_transformation` affect the behavior of the `tag template section` when the `log2` parameter is used. Refer to the description of the [table template settings](#the-settings-area-of-the-table-template) for more information. + - Type: `bool` + - Default: `False` + - Is optional + +For additional optional parameters refer to the [common optional template section parameters](#5-common-optional-template-section-parameters). + + +#### How does the `tag template section` look like in practice? + +Let's assume you have the following table template, containing a `tag template section` with the name "intensities": ```YAML sections: - intensities: {tag: "^Intensity"} + intensities: {tag: "^Intensity."} ``` and a CSV file with the following columns -| Protein ID | Intensity sample_1 | Intensity sample_2 | Total Intensity | -| ---------- | ------------------ | ------------------ | --------------- | -| P40238 | 1,000,000 | 2,000,000 | 3,000,000 | +| Protein IDs | Intensity sample_1 | Intensity sample_2 | Mean Intensity | +| ----------- | ------------------ | ------------------ | -------------- | +| P40238 | 1,000 | 2,000 | 1,500 | + + When generating a report, XlsxReport selects columns that match the regular expression pattern `^Intensity.`, which results in the selection of the columns "Intensity sample_1" and "Intensity sample_2". The specified pattern requires that a column name starts with "Intensity" and that "Intensity" is followed by at least one additional character. This pattern does not match the column "Mean Intensity", which is therefore not included in the `section`. + + +### 3) Label tag template section - When generating a report, XlsxReport selects columns that match the regular expression pattern specified with `tag`, i.e.
"^Intensity.", which resulst in the selection of the columns "Intensity sample_1" and "Intensity sample_2". The specified patterns requires that a column starts with "Intensity", therefore the column "Total Intensity" is not selected. +The `label tag template section` is an extension of the `tag template section` that enables more precise control over the selection of columns. While you only specify the constant part of a column name with the `tag` parameter in the `tag template section`, the `label tag template section` allows you to also specify the variable part of the columns that should be included in the section. This is achieved with the `labels` parameter, which is a list representing the variable part of the columns that should be included in the `section`. For a column to be selected, the column name must exactly contain the constant part specified in the `tag` parameter and one of the variable parts specified in the `labels` parameter. +As the `label tag template section` requires preexisting knowledge of the variable part of column names, such as sample names that are expected to change between datasets, it is typically not used in `table templates` that are intended as a general template for a specific data analysis pipeline. However, it is very useful to dynamically adjust a table template to a specific dataset. For example, when you want to create a report but only select a subset of the columns that are matched by the `tag` parameter. -#### The comparison section +#### Required and optional section parameters ---- *Note: this section needs to be rewritten* --- +- `tag`
+A regular expression pattern that is matched against all column names. Columns that match the pattern are selected for the section. Columns are written to the Excel sheet in the order they appear in the input table. + - Type: `string` + - Is required -The **comparison group** allows defining a block of differential expression -comparison columns. A comparison group is defined by the parameters `tag` and `columns`. The columns that belong to a comparison group have a column -name that consists of one part that describes the content of the column, for -example "P-value" or "Fold change", and another part that describes which -samples or experiments are compared, for example "Control vs. Condition". To -identify comparison columns, the comparison symbol must be defined with -the "tag" parameter, in this example the "tag" corresponds to " vs. ", and -the strings that describe the column contents must be listed in the "columns" -parameter, in this example ["P-value", "Fold change"]. In this example the -comparison group would include the columns "P-value Control vs. Condition" and -"Fold change Control vs. Condition". +- `labels`
+A sequence of strings that represent the variable part of the column names that should be included in the section. + - Type: `sequence[string]` + - Is required -#### Common template section parameters -- `format: str`
---- *Note: Description missing* --- +- `remove_tag`
+If True, the matched regular expression pattern of the `tag` is removed from the column name in the Excel sheet. Removing the tag is useful in combination with adding a `supheader` to the section, as in this way the removed tag can be displayed only in the supheader row. + - Type: `bool` + - Default: `False` + - Is optional -- `column_format: str`
---- *Note: Description missing* --- +- `log2`
+Use this parameter to apply a log2 transformation to the values of all selected columns in the `label tag template section`. The global setting parameters `log2_tag` and `evaluate_log2_transformation` affect the behavior of the `label tag template section` when the `log2` parameter is used. Refer to the description of the [table template settings](#the-settings-area-of-the-table-template) for more information. + - Type: `bool` + - Default: `False` + - Is optional -- `conditional: str`
---- *Note: Description missing* --- +For additional optional parameters refer to the [common optional template section parameters](#5-common-optional-template-section-parameters). -- `column_conditional: str`
---- *Note: Description missing* --- -- `header_format: str`
---- *Note: Description missing* --- +### 4) Comparison template section -- `supheader: str`
---- *Note: Description missing* --- +The `comparison template section` allows you to select and group columns that represent pair-wise comparisons of conditions. Examples of such columns include statistical comparisons of differences between two experimental conditions or the ratio between the mean intensities of those conditions. To be included in a `comparison section`, column names must follow a specific logic: -- `supheader_format: str`
---- *Note: Description missing* --- +1. **Condition Description**: Column names must contain a part that describes the conditions being compared, such as "Control vs. Condition". It is crucial to have a consistent symbol between the two conditions to identify the comparison columns. This symbol is defined by the `tag` parameter. In the example, the `tag` is " vs. ". +2. **Comparison Type**: Column names must also include a part that describes the type of comparison, such as "P-value" or "Ratio". Multiple types of comparisons can be included in a `comparison section`, and the substrings that identify the comparison type are listed in the `columns` parameter. + + +#### Required and optional section parameters + +- `tag`
+A string that corresponds to the comparison symbol between two conditions. The `tag` is used to pre-select columns that might belong to the `comparison section`. + - Type: `string` + - Is required + +- `columns`
+A sequence of strings that correspond to the substrings that identify the comparison type in the column names. + - Type: `sequence[string]` + - Is required + +- `replace_comparison_tag`
+Optional parameter that allows you to replace the comparison tag with a different string. This can be useful if you want to change the comparison tag in the Excel sheet to make it more readable. + - Type: `string` + - Is optional + +- `remove_tag`
+If True, the condition comparison string is removed from the columns, leaving only the comparison type that was specified with the `columns` parameter. This option is useful in combination with adding a `supheader` to the section, as in this way the condition comparison string is only displayed in the supheader row. + - Type: `bool` + - Default: `False` + - Is optional + +For additional optional parameters refer to the [common optional template section parameters](#5-common-optional-template-section-parameters). + +> _**NOTE:** Several optional parameters work slightly differently in this `section` type. The parameters `column_format` and `column_conditional_format` are used to apply formats and conditional formats to columns that contain a specific comparison type, for example "P-value" or "Ratio". The column names that are specified for these parameters therefore need to correspond to entries from the `columns` parameter. The `supheader` parameter has no effect, as the supheader is automatically generated from the column names and corresponds to the conditions that are compared._ + + +#### How does the `comparison template section` work in practice? + +Let's look at an example to illustrate how the `comparison template section` works. Assume you have the following columns in your input table: + +- "Ratio Control vs. Condition" +- "Ratio Control vs. Another condition" +- "P-value Control vs. Condition" +- "P-value Control vs. Another condition" +- "Intensity Control vs. Condition" +- "Intensity Control vs. Another condition" + +And the following table template: + +```YAML +sections: + statistical_comparison: + tag: " vs. " + columns: ["P-value", "Ratio"] +``` -- `width: float`
---- *Note: Description missing* --- +When generating a report, XlsxReport first collects all comparison columns, then groups them according to the conditions that are compared. All columns that compare the same two conditions are then used to write a separate `section` to the Excel sheet: -- `border: bool`
---- *Note: Description missing* --- +In the example, the first section contains the columns: +- "Ratio Control vs. Condition" +- "P-value Control vs. Condition" -### Format parameters area - `formats` -In the `formats` area the formats must be defined that are applied in the template sections. In addition, by specifying a format called "header" and "supheader" it is possible to define default formats for the header and supheader row. +And the second section contains the columns: + +- "Ratio Control vs. Another condition" +- "P-value Control vs. Another condition" + +The columns "Intensity Control vs. Condition" and "Intensity Control vs. Another condition" are not included in any `section`, since "Intensity" was not listed in the `columns` parameter of the `comparison section`. + + +### 5) Common optional template section parameters + +The following optional parameters can be used in all `template section` types to specify formatting and other settings: + +- `format`
+The default format that is applied to all columns in the `section`. The format must be defined in the `formats` area of the table template. + - Type: `string` + +- `column_format`
+A mapping that specifies formats that are applied to individual columns in the `section`. The column format overrides the default format. The keys are column names, and the values are format names that are defined in the `formats` area of the table template. + - Type: `mapping[string, string]` + +- `conditional_format`
+The name of the conditional format that is applied to the values of all columns in the `section`. The conditional format must be defined in the `conditional_formats` area of the table template. + - Type: `string` + +- `column_conditional_format`
+A mapping that specifies the conditional format that is applied to the values of individual columns in the `section`. The keys are column names, and the values are conditional format names that are defined in the `conditional_formats` area of the table template. + - Type: `mapping[string, string]` + +- `header_format`
+Specifies additional formatting properties that are applied to the header format of the `section`. The specified formatting properties are added to the default "header" format that can be defined in the `formats` area of the table template. For more information about how to define formatting properties, refer to the [documentation of the formats area](#the-formats-area-of-the-table-template). + - Type: `mapping[string, mapping]` + +- `supheader`
+A string that is written to the Excel sheet above the header row of the `section`. The `supheader` is written to a merged cell that spans all columns of the `section`. The `supheader` is only written if the global setting `write_supheader` is set to `True`. + - Type: `string` + +- `supheader_format`
+Specifies additional formatting properties that are applied to the supheader format of the `section`. The specified formatting properties are added to the default "supheader" format that can be defined in the `formats` area of the table template. For more information about how to define formatting properties, refer to the [documentation of the formats area](#the-formats-area-of-the-table-template). + - Type: `mapping[string, mapping]` + +- `width`
+Defines the column widths in pixels. The width is applied to all columns in the `section` and overwrites the default column width that is defined in the `settings` area of the table template. + - Type: `float` + +- `border`
+If set to True, a thick border line is added to the left and right side of the section. + - Type: `boolean` + - Default: `False` + + +## The `formats` area of the table template + +In the `formats` area the formats must be defined that are applied in the `sections` area of the table template. In addition, by specifying a format called "header" and "supheader" it is possible to define the default formats for the header and supheader row. Refer to the [XlsxWriter](https://xlsxwriter.readthedocs.io/format.html#format-methods-and-format-properties) -documentation for additional information which parameters can be defined for a format. +documentation for additional information which parameters can be defined for a format. Note that entries of the **Property** column from the documentation correspond to the keys that can be defined in the format mapping. + +Here is an example of a `formats` area that defines the formats "int", "float", "str", "header", and "supheader": + +```YAML +formats: + int: {"align": "center", "num_format": "0"} + float: {"align": "center", "num_format": "0.00"} + str: {"align": "left", "num_format": "0"} + header: { + "bold": True, + "align": "center", + "valign": "vcenter", + "bottom": 2, + "top": 2, + "text_wrap": True + } + supheader: { + "bold": True, + "align": "center", + "valign": "vcenter", + "bottom": 2, + "left": 2, + "right": 2, + "text_wrap": True + } +``` + + +## The `conditional_formats` are of the table template -### Conditional format area - `conditional_formats` In the `conditional_formats` area the conditional formats must be defined that are applied in the template sections. -Refer to the [XlsxWriter](https://xlsxwriter.readthedocs.io/working_with_conditional_formats.html) documentation for additional information which parameters can be defined for a conditional format. +The type of conditional format needs to be defined with the `type` parameter. 
Currently only the types `2_color_scale`, `3_color_scale`, and `data_bar` are supported. In addition, the formatting parameters corresponding to the selected type must be defined. + +Refer to the [XlsxWriter](https://xlsxwriter.readthedocs.io/working_with_conditional_formats.html) documentation for additional information which parameters can be defined for different conditional format types. + +Here is an example of a `conditional_formats` area that defines a 3-color scale conditional format called "intensity": + +```YAML +conditional_formats: + intensity: { + "type": "3_color_scale", + "min_type": "min", "min_color": "#2c7bb6", + "mid_type": "percentile", "mid_value": 50, "mid_color": "#ffffbf", + "max_type": "max", "max_color": "#f25540" + } +``` -### Settings area - `settings` +## The `settings` area of the table template The `settings` area is used to define general settings affecting all content that is written to the Excel sheet. -- `supheader_height: float (default: 20)`
+- `supheader_height`
Defines the supheader row height in pixels. + - Type: `float` + - Default: `20` -- `header_height: float (default: 20)`
+- `header_height`
Defines the header row height in pixels. + - Type: `float` + - Default: `20` -- `column_width: float (default: 64)`
-Defines default column width. This parameter is overwritten if a `width` section parameter is defined. +- `column_width`
+Defines the default column widths in pixels. This parameter is overwritten by the `width` parameter specified in the `template sections`. + - Type: `float` + - Default: `64` -- `log2_tag: str (default "")`
-If specified this string is added as a suffix to the supheader or header of a tag section if the `log2` section parameter is defined, and a log2 transformation is applied to the column values. +- `log2_tag`
+If specified, this tag is added as a suffix to the column headers of a `tag template section` or a `label tag template section` to indicate that a log2 transformation has been applied. The `log2_tag` is only added if the section parameter `log2` is set to `True`. If the section parameter `remove_tag` is set to `True`, the `log2_tag` is added to the section supheader instead of the column headers. + - Type: `string` + - Default: `""` -- `append_remaining_columns: bool (default: False)`
+- `append_remaining_columns`
If True, all remaining columns that are not present in any section are added to the end of the Excel sheet, and the section of appended columns is hidden. + - Type: `bool` + - Default: `False` -- `write_supheader: bool (default: False)`
+- `write_supheader`
If True, a supheader row is added above the header row. + - Type: `bool` + - Default: `False` -- `evaluate_log2_transformation: bool (default: False)`
-If True, column values are evaluated if they appear to be already log transformed before a log2 transformation is applied. +- `evaluate_log2_transformation`
+**Use this setting with caution!** If True, column values are checked for whether they appear to be already log transformed before a log2 transformation is applied. Values are assumed to be log transformed if all values in a column are smaller than or equal to 64. Intensity values (and intensity peak areas) reported by tandem mass spectrometry typically range from 10^1 to 10^12. To reach log2 transformed values greater than 64, intensities would need to be higher than 10^19, which is very unlikely to ever be encountered. + - Type: `bool` + - Default: `False` -- `remove_duplicate_columns: bool (default: True)`
-If True columns that are already present in a section are removed from subsequent sections. +- `remove_duplicate_columns`
+If True, columns that are already present in a section are removed from subsequent sections. This option guarantees that columns are not duplicated in the Excel sheet. + - Type: `bool` + - Default: `True` -- `add_autofilter: bool (default: True)`
+- `add_autofilter`
If True, adds an Excel auto filter to the header row. + - Type: `bool` + - Default: `True` -- `freeze_cols: int (default: 1)`
-If a value larger than 0 is specified, freeze pane is applied in the Excel sheet. The selected row for freezing will always be the header row, the selected column is chosen based on the specified value. +- `freeze_cols`
+If a value larger than 0 is specified, freeze panes are applied in the Excel sheet. The row selected for freezing is always the header row; the position of the frozen column is defined by the specified value. + - Type: `int` + - Default: `1` From 5af716bcb4d6ebf2d706d724eecef7d2f8c2f959 Mon Sep 17 00:00:00 2001 From: "David M. Hollenstein" Date: Sat, 21 Sep 2024 12:32:34 +0200 Subject: [PATCH 7/9] docs: Add additional information about the project --- README.md | 47 +++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index b8e2704..d7bbae5 100644 --- a/README.md +++ b/README.md @@ -14,24 +14,27 @@ - [Installation](#installation) - [Setting up the application data directory](#setting-up-the-application-data-directory) - [Installation when using Anaconda](#installation-when-using-anaconda) -- [Upcoming features and work in progress](#upcoming-features-and-work-in-progress) +- [Additional project information](#additional-project-information) + - [Documenation](#documenation) + - [Upcoming features and work in progress](#upcoming-features-and-work-in-progress) + - [Do you have feedback or need help?](#do-you-have-feedback-or-need-help) ## What is XlsxReport? Well-formatted Excel reports are important for presenting and sharing data in a clear and structured manner with collaborators, in publications, and for the manual inspection of results. However, creating these reports manually is time-consuming, tedious, and has to be repeated for every new dataset and analysis. XlsxReport was developed to streamline the process of turning tabular data into formatted Excel reports. By automating this task, XlsxReport allows the creation of consistent, publication-ready Excel reports with minimal effort. -XlsxReport uses YAML template files to define the content, structure, and formatting of the generated Excel reports.
The library provides a command-line interface and a Python API, allowing users to create Excel reports by applying table templates to tabular data. Although XlsxReport has been developed for quantitative mass spectrometry data, its versatile design makes it suitable for any type of tabular data. +XlsxReport uses YAML template files to define the content, structure, and formatting of the generated Excel reports. The library provides a command line interface and a Python API, allowing users to create Excel reports by applying table templates to tabular data. Although XlsxReport has been developed for quantitative mass spectrometry data, its versatile design makes it suitable for any type of tabular data. XlsxReport is actively developed as part of the computational toolbox for the [Mass Spectrometry Facility](https://www.maxperutzlabs.ac.at/research/facilities/mass-spectrometry-facility) at the Max Perutz Labs (University of Vienna). ## Getting Started with a simple example -With XlsxReport, generating reproducibly formatted Excel reports from your data analysis pipeline is a breeze - simply create a YAML table template once and execute a single terminal command to create Excel reports whenever needed. +With XlsxReport, generating reproducibly formatted Excel reports from your data analysis pipeline is a breeze - simply create a YAML table template once and execute a single command on the command line to create Excel reports whenever needed. Give it a try by using the provided example files in the `examples` directory. The `examples` directory contains a "proteinGroups.txt" file from MaxQuant, which can be turned into a formatted Excel report with the included default table template file "maxquant.yaml". 
-After installing XlsxReport and setting up the application data directory as described below, you can create an Excel report by running the following command in the terminal: +After installing XlsxReport and setting up the application data directory as described below, you can create an Excel report by running the following command on the command line: ```shell xlsxreport compile examples/proteinGroups.txt maxquant.yaml @@ -95,11 +98,39 @@ xlsxreport appdir --reveal ### Installation when using Anaconda -To install the XlsxReport package using Anaconda, you need to either activate a custom conda environment or install it into the default base environment. Open the Anaconda Navigator, activate the desired conda environment or use the base environment, and then open a terminal by running the "CMD.exe" application. Finally, use the `pip install` command as previously before. +To install the XlsxReport package using Anaconda, you need to either activate a custom conda environment or install it into the default base environment. Open the Anaconda Navigator, activate the desired conda environment or use the base environment, and then open a command line by running the "CMD.exe" application. Finally, use the `pip install` command as described above. -## Upcoming features and work in progress +## Additional project information -The library has reached a stable state and we are currently working on **extending the documentation** and adding **minor feature enhancements**. In addition, we are planning to also release a **simple GUI** for creating Excel reports that provides the same functionality as the command-line interface. -If you have any feature requests, suggestions, or bug reports, please feel free to open an issue on the [GitHub issue tracker](https://github.com/hollenstein/xlsxreport/issues). \ No newline at end of file +### Documenation + +The documentation of XlsxReport is a work in progress. 
In the meantime, you can find a detailed description of the table template and its formatting options in the [DOCUMENTATION.md](https://github.com/hollenstein/xlsxreport/blob/main/DOCUMENTATION.md) file in the GitHub repository. + +The Python API is currently documented only in the source code. The stable public API comprises the functions and classes that are directly present in the `xlsxreport` namespace; please refer to the `xlsxreport/__init__.py` file for more information. + +For more information about the **command line interface**, you can run the following command: + +```shell +xlsxreport --help +``` + +To get help for a specific command (`appdir`, `compile`, or `validate`), you can run: + +```shell +xlsxreport <command> --help +``` + +You can find a comprehensive record of changes in the [CHANGELOG.md](https://github.com/hollenstein/xlsxreport/blob/main/CHANGELOG.md) file. + + +### Upcoming features and work in progress + +The library has reached a stable state and we are currently working on **extending the documentation** and adding **minor feature enhancements**. In addition, we are planning to release a **simple GUI** for creating Excel reports that provides the same functionality as the command line interface and lowers the barrier for users who are not comfortable with using the command line. + +### Do you have feedback or need help? + +If you have any feature requests, suggestions, or bug reports, please feel free to open an issue on the [GitHub issue tracker](https://github.com/hollenstein/xlsxreport/issues). + +Are you unsure how to use the library, or do you have a question? Please feel free to contact us via email or on GitHub. We are happy to help you get started with XlsxReport and answer any questions you might have. From bb776ca9e565bf13d11dae04b75457aa8cf0834a Mon Sep 17 00:00:00 2001 From: "David M. 
Hollenstein" Date: Sat, 21 Sep 2024 12:55:50 +0200 Subject: [PATCH 8/9] docs: Update changelog --- CHANGELOG.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index edaa2b7..65176fe 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,18 @@ ---------------------------------------------------------------------------------------- +## Version - UPCOMING + +### Internal +- Added GitHub Actions for automatic testing of the package. + +### Documentation +- Updated the README.md file to provide more useful information about what XlsxReport is and the status of the project. +- Updated the DOCUMENTATION.md file with detailed information on how the table template works and which formatting options are available. +- Added an example output file from MaxQuant that can be used together with the "maxquant.yaml" template to create a formatted Excel report. + +---------------------------------------------------------------------------------------- + ## Version [0.1.0] - Interface and template rewrite Released: 2024-04-23 From ef49f0ec65c3d1e518e0fef5fbee7bd0c148052a Mon Sep 17 00:00:00 2001 From: "David M. Hollenstein" Date: Sat, 21 Sep 2024 12:56:38 +0200 Subject: [PATCH 9/9] Bump version to 0.1.1 --- CHANGELOG.md | 3 ++- xlsxreport/__init__.py | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 65176fe..85baa6a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,7 +2,8 @@ ---------------------------------------------------------------------------------------- -## Version - UPCOMING +## Version [0.1.1] - Documentation and CI with GitHub Actions +Released: 2024-09-21 ### Internal - Added GitHub Actions for automatic testing of the package. diff --git a/xlsxreport/__init__.py b/xlsxreport/__init__.py index e1447f3..0188518 100644 --- a/xlsxreport/__init__.py +++ b/xlsxreport/__init__.py @@ -24,4 +24,4 @@ __author__ = "David M. 
Hollenstein" __license__ = "Apache 2.0" -__version__ = "0.1.0" +__version__ = "0.1.1"
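
Reviewer note on the `freeze_cols` wording earlier in this series: the rule it describes (no freeze pane for values of 0 or less, otherwise freeze at the header row and at the column position given by the value) can be sketched as a small Python function. This is only an illustration and not code from xlsxreport. The function name `freeze_pane_position` and the `header_row` parameter are hypothetical, and the zero-based `(row, col)` return convention is an assumption modelled on XlsxWriter's `freeze_panes(row, col)`.

```python
from typing import Optional, Tuple


def freeze_pane_position(
    freeze_cols: int = 1, header_row: int = 0
) -> Optional[Tuple[int, int]]:
    """Return the zero-based (row, col) split position, or None for no freeze.

    Hypothetical helper for illustration; not part of the xlsxreport API.
    `header_row` is the assumed zero-based index of the header row.
    """
    if freeze_cols <= 0:
        # A value of 0 (or less) means no freeze pane is applied.
        return None
    # Rows up to and including the header stay visible, as do the
    # first `freeze_cols` columns.
    return (header_row + 1, freeze_cols)


print(freeze_pane_position())   # default freeze_cols=1, header in row 0
print(freeze_pane_position(0))  # freezing disabled
```

Under these assumptions the default `freeze_cols: 1` keeps the header row and the first column in view while scrolling.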