Open
Commits
63 commits
59c9a5d
Feature: CI
AlexanderMann May 28, 2021
2a6573b
Trigger Build
AlexanderMann May 28, 2021
e4bf9fe
--
AlexanderMann May 28, 2021
363a22a
Feature: Local testing env
AlexanderMann May 28, 2021
da5b4ac
Housekeeping: Lint tap_circle_ci.py
AlexanderMann May 28, 2021
59820bb
Refactor: Remove redundant if block
AlexanderMann May 28, 2021
cf8665d
Housekeeping: Lint client.py
AlexanderMann May 28, 2021
ae8ea91
Housekeeping: Upodate doc links
AlexanderMann May 28, 2021
4cf8136
--
AlexanderMann May 28, 2021
61bc294
Refactor: Prep for items which aren't nested
AlexanderMann May 28, 2021
b02ad12
Feature: Steps
AlexanderMann May 28, 2021
5aaed03
Housekeeping: Lint streams.py
AlexanderMann May 28, 2021
3860b4a
Fix: Accidentally checked in changes to buffer
AlexanderMann May 28, 2021
67befdf
Housekeeping: Lint README.md
AlexanderMann May 28, 2021
cdc2266
Feature: Configure the buffer time
AlexanderMann May 28, 2021
e7d39b6
README: Update with state information
AlexanderMann May 28, 2021
7eb0c49
Merge pull request #1 from AlexanderMann/amann/feature/steps-stream
AlexanderMann May 28, 2021
4df475c
README: Add in reference to steps data
AlexanderMann May 29, 2021
95a49e3
README: Fix link to pipelines
AlexanderMann May 29, 2021
3c00ef3
Feature: Include all key_properties in records
AlexanderMann May 29, 2021
d558cf4
Fix: Single value schemas bork the generative tests
AlexanderMann May 29, 2021
8940113
Merge pull request #2 from AlexanderMann/amann/feature/key_properties
AlexanderMann May 29, 2021
7f036ac
Housekeeping: No longer need key_properties TODO
AlexanderMann May 29, 2021
3d8f99d
WIP
AlexanderMann May 29, 2021
f0649e6
Feture: Remove lag/buffer period
AlexanderMann May 31, 2021
306e7f1
Housekeeping: Remove unused dependency
AlexanderMann May 31, 2021
562744e
Debug: Detail the total memory footprint of pulling pipelines into me…
AlexanderMann May 31, 2021
fad6f12
Debug: Detail the total memory footprint of pulling pipelines into me…
AlexanderMann May 31, 2021
1785683
Housekeeping: Uninstrument code
AlexanderMann May 31, 2021
2ad215b
Merge pull request #3 from AlexanderMann/amann/feature/no-more-hacky-…
AlexanderMann May 31, 2021
35aa141
README: Remove unused config option
AlexanderMann May 31, 2021
0d81288
Merge pull request #4 from AlexanderMann/amann/feature/no-more-hacky-…
AlexanderMann May 31, 2021
ddbe8f5
Feature: Incremental bookmarks from stream
AlexanderMann Jun 1, 2021
dccee5d
Housekeeping: Don't need to immediately spit out state since we do th…
AlexanderMann Jun 1, 2021
98837a8
Merge pull request #5 from AlexanderMann/amann/feature/incremental-bo…
AlexanderMann Jun 1, 2021
a9460f8
Update README.md
JChouCode Jul 29, 2021
6dfcef5
Update README.md
JChouCode Jul 29, 2021
cb543b8
Edge Case: build_num not found
Jul 29, 2021
f8f0baf
Rename executable
Jul 29, 2021
ecc8110
Update executable
Jul 29, 2021
1fa58ae
Update README.md
JChouCode Jul 30, 2021
40aac89
Update README.md
JChouCode Jul 30, 2021
071a786
Debug: Get more information on failed gets
AlexanderMann Dec 21, 2022
1432382
Merge pull request #1 from apollographql/amann/debug/more-information…
AlexanderMann Dec 21, 2022
fa09c0f
Feature: log detailed information about missing pipelines, and return…
AlexanderMann Dec 21, 2022
45c6a55
Fix: continue -> break based on the comment...
AlexanderMann Dec 21, 2022
565edd3
Fix: logger.warn -> warning
AlexanderMann Dec 21, 2022
50113db
Merge pull request #2 from apollographql/amann/fix/aged-out-pipelines
AlexanderMann Dec 21, 2022
79b8ea0
Feature: More debugging information
AlexanderMann Jan 4, 2023
c334eec
Housekeeping: Give us information about whether we're even getting an…
AlexanderMann Jan 4, 2023
e04d177
Merge pull request #3 from apollographql/amann/housekeeping/more-debu…
AlexanderMann Jan 4, 2023
c55bc5a
Feature: cancel-pipeline.py
AlexanderMann Jan 5, 2023
187719a
Housekeeping: fewer placeholder vars
AlexanderMann Jan 5, 2023
6cb4c52
Merge pull request #4 from apollographql/amann/feature/tools-for-bulk…
AlexanderMann Jan 5, 2023
747ab20
Fix: CircleCI does not allow the state of aged out pipelines to change
AlexanderMann Jul 21, 2023
2566d16
Merge pull request #6 from apollographql/amann/jsegaran/fix/pipeline_…
AlexanderMann Jul 21, 2023
b2eb3e7
SECOPS-2268: Add Gitleaks to CI (#7)
peakematt Nov 16, 2023
3e42b99
Feature: Pull available useful fields from jobs api
AlexanderMann Nov 17, 2023
8f94649
Merge pull request #9 from apollographql/amann/feature/pull-in-execut…
AlexanderMann Nov 17, 2023
5d96c0d
add default CODEOWNERS (#8)
svc-secops Nov 30, 2023
d291541
Init
AlexanderMann Dec 1, 2023
dd7277c
--
AlexanderMann Dec 1, 2023
d809d00
Fix: Tests referencing old value
AlexanderMann Dec 1, 2023
57 changes: 57 additions & 0 deletions .circleci/config.yml
@@ -0,0 +1,57 @@
version: 2.1

orbs:
  secops: apollo/[email protected]

workflows:
  version: 2
  test:
    jobs:
      - test
  security-scans:
    jobs:
      - secops/gitleaks:
          context:
            - platform-docker-ro
            - github-orb
            - secops-oidc
          git-base-revision: <<#pipeline.git.base_revision>><<pipeline.git.base_revision>><</pipeline.git.base_revision >>
          git-revision: << pipeline.git.revision >>

cache: &cache v0-{{ checksum "setup.py" }}

jobs:
  test:
    docker:
      - image: python:3.9.5-buster
    working_directory: /code/
    steps:
      - checkout
      - restore_cache:
          keys:
            - *cache

      - run:
          name: Install tap-circle-ci
          command: |
            python -m venv venv/tap-circle-ci
            source venv/tap-circle-ci/bin/activate
            pip install -e .[tests]
            deactivate

      - save_cache:
          key: *cache
          paths:
            - "./venv"
            - "/usr/local/bin"
            - "/usr/local/lib/python3.9/site-packages"

      - run:
          name: Run Tests
          command: |
            source venv/tap-circle-ci/bin/activate
            pytest --verbose

      - store_artifacts:
          path: target/test-results
          destination: raw-test-output
4 changes: 3 additions & 1 deletion .gitignore
@@ -2,5 +2,7 @@
*.egg
__pycache__/

sample_config.json
venv/
catalog.json
config.json
state.json
4 changes: 4 additions & 0 deletions CODEOWNERS
@@ -0,0 +1,4 @@
# This file was automatically generated by the Apollo SecOps team
# Please customize this file as needed prior to merging.

* @apollographql/data
197 changes: 129 additions & 68 deletions README.md
@@ -1,5 +1,26 @@
# tap-circle-ci

# tap-circleci

## Meltano Extractor Setup

```
meltano add extractor --custom tap-circleci
```
In the interactive portion, use these variables
```
name: tap-circleci
pypi: git+https://github.com/JChouCode/tap-circleci.git
executable: tap-circleci
capabilities: discover,catalog,state
config: project_slugs,token:password
```

## Improvements
This fork improves the tap to handle edge cases that cause errors.
- Edge Case: Job is cancelled and build number is not created, causing a 404 error when requesting unknown build number.
- Improved Bookmarking
- Added `tooling/` for various scripts which help wrangle some of the sharp corners of CircleCI

## Sisu Data - About
This is a [Singer](https://singer.io) tap that produces JSON-formatted data
following the [Singer
spec](https://github.com/singer-io/getting-started/blob/master/SPEC.md).
@@ -8,94 +29,134 @@ This tap:

- Pulls raw data from [Circle CI](https://circleci.com/)
- Extracts the following resources:
- [Pipelines](https://circleci.com/docs/api/v2/#get-all-pipelines)
- [Workflows](https://circleci.com/docs/api/v2/#get-a-pipeline-39-s-workflows)
- [Jobs](https://circleci.com/docs/api/v2/#get-a-workflow-39-s-jobs)
- [Pipelines](https://circleci.com/docs/api/v2/#operation/listPipelines)
- [Workflows](https://circleci.com/docs/api/v2/#operation/listWorkflowsByPipelineId)
- [Jobs](https://circleci.com/docs/api/v2/#operation/listWorkflowJobs)
- [Steps](https://circleci.com/docs/api/#single-job)
- Outputs the schema for each resource
- Incrementally pulls data based on the input state


## Quick start

1. Install

```bash
git clone git@github.com:sisudata/tap-circle-ci.git && cd tap-circle-ci && pip install -e .
```
```bash
git clone git@github.com:apollographql/tap-circleci.git && cd tap-circleci && pip install -e .
```

2. Create a Circle CI access token

Login to your Circle CI account, go to the
[Personal API Tokens](https://circleci.com/account/api)
page, and generate a new token. Copy the token and save it somewhere safe.
Login to your Circle CI account, go to the
[Personal API Tokens](https://circleci.com/account/api)
page, and generate a new token. Copy the token and save it somewhere safe.

3. Create the config file
3. Create the config file (see below)

Create a JSON file containing the token you just created as well as the project slug to the project you want to extract data from. Retrieve the project slug
from the url for a workflow - it should be the VCS your project uses (`gh` for Github or `bb` for Bitbucket), followed by the owner or organization, followed by the repository name
ex. `gh/singer-io/singer-python`. You can enter multiple project slugs separated by spaces to pull data from multiple projects.
Create a JSON file containing the token you just created as well as the project slug to the project you want to extract data from. Retrieve the project slug
from the url for a workflow - it should be the VCS your project uses (`gh` for Github or `bb` for Bitbucket), followed by the owner or organization, followed by the repository name
ex. `gh/singer-io/singer-python`. You can enter multiple project slugs separated by spaces to pull data from multiple projects.

```json
{
"token": "your-access-token",
"project_slugs": "gh/singer-io/singer-python gh/singer-io/getting-started"
}
```

```json
{
"token": "your-access-token",
"project_slugs": "gh/singer-io/singer-python gh/singer-io/getting-started"
}
```
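
As a rough illustration of step 3, the config file could also be generated from a list of slugs rather than typed by hand. This is only a sketch: `build_config` is a hypothetical helper that is not part of the tap, and the three-part `vcs/owner/repo` check is an assumption drawn from the slug description above.

```python
import json


def build_config(token, project_slugs):
    """Return a config dict with the slugs joined by single spaces.

    Hypothetical helper, not part of tap-circleci. It assumes every slug
    has the three-part shape vcs/owner/repo described in the README.
    """
    for slug in project_slugs:
        parts = slug.split("/")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"unexpected project slug: {slug!r}")
    return {"token": token, "project_slugs": " ".join(project_slugs)}


if __name__ == "__main__":
    config = build_config(
        "your-access-token",
        ["gh/singer-io/singer-python", "gh/singer-io/getting-started"],
    )
    # Redirect this output to config.json to use it with --config.
    print(json.dumps(config, indent=2))
```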
4. Run the tap in discovery mode to get catalog.json file

```bash
tap-circle-ci --config config.json --discover > catalog.json
```
```bash
tap-circleci --config config.json --discover > catalog.json
```

5. In the catalog.json file, select the streams to sync

Each stream in the properties.json file has a "metadata" entry. To select a stream to sync, add
`{"breadcrumb": [], "metadata": {"selected": true}}` to that stream's "metadata" entry.
For example, to sync the pipelines stream:
```
...
"type": [
"null",
"object"
],
"additionalProperties": false
},
"stream": "pipelines",
"metadata": [{"breadcrumb": [], "metadata": {"selected": true}}]
},
...
```
Another way to select a stream to sync is to add `"selected": true` into that stream's schema:

```
...
"tap_stream_id": "workflows",
"key_properties": [],
"schema": {
"selected": true,
"properties": {
"_pipeline_id": {
"type": [
"null",
"string"
]
...
```
Either way is acceptable, but the first way is preferred.
Each stream in the properties.json file has a "metadata" entry. To select a stream to sync, add
`{"breadcrumb": [], "metadata": {"selected": true}}` to that stream's "metadata" entry.
For example, to sync the pipelines stream:

```
...
"type": [
"null",
"object"
],
"additionalProperties": false
},
"stream": "pipelines",
"metadata": [{"breadcrumb": [], "metadata": {"selected": true}}]
},
...
```

Another way to select a stream to sync is to add `"selected": true` into that stream's schema:

```
...
"tap_stream_id": "workflows",
"key_properties": [],
"schema": {
"selected": true,
"properties": {
"_pipeline_id": {
"type": [
"null",
"string"
]
...
```

Either way is acceptable, but the first way is preferred.
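
The preferred (metadata) form of selection can be applied programmatically. The sketch below is not part of the tap: `select_streams` is a hypothetical helper, and it assumes the catalog has the usual Singer layout with a top-level `"streams"` list, which the excerpts above do not show in full.

```python
import copy


def select_streams(catalog, stream_names):
    """Return a copy of the catalog with the named streams selected.

    Hypothetical helper. Assumes catalog == {"streams": [...]} where each
    entry has a "stream" name and a "metadata" list, as in the excerpts above.
    """
    selected = copy.deepcopy(catalog)
    for entry in selected.get("streams", []):
        if entry.get("stream") not in stream_names:
            continue
        metadata = entry.setdefault("metadata", [])
        for item in metadata:
            # The stream-level metadata entry has an empty breadcrumb.
            if item.get("breadcrumb") == []:
                item.setdefault("metadata", {})["selected"] = True
                break
        else:
            # No stream-level entry yet, so add the one the README describes.
            metadata.append({"breadcrumb": [], "metadata": {"selected": True}})
    return selected
```

Loading `catalog.json` with `json.load`, passing it through this function with `{"pipelines"}`, and writing the result back would have the same effect as the manual edit shown for the pipelines stream.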

6. Run the application (will print records and other messages to the console)

`tap-circle-ci` can be run with:
`tap-circleci` can be run with:

```bash
tap-circleci --config config.json --catalog catalog.json
```

To save output to a file:

```bash
tap-circleci --config config.json --catalog catalog.json > output.txt
```

It is our intention that this singer tap gets used with a singer target, which will load the output into a database.
More information on singer targets [here](https://github.com/singer-io/getting-started/blob/master/docs/RUNNING_AND_DEVELOPING.md#running-a-singer-tap-with-a-singer-target).

7. To rerun using the last output `STATE` record:

In your output records, you will see something like:

```json
{
"type": "STATE",
"value": {
"bookmarks": {
"gh/apollographql/tap-circleci": {
"pipelines": { "since": "2023-11-15T00:00:00.000000Z" }
}
}
}
}
```

Select the `value` key, store it to a JSON file, and run:

```bash
tap-circleci --config config.json --catalog catalog.json --state state.json
```
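
Pulling the last `STATE` value out of saved output can be scripted. The following is a minimal sketch under two assumptions: the output was saved with one JSON message per line (the pretty-printed record above is taken to be formatted for readability), and `last_state_value` is a hypothetical helper, not something the tap ships.

```python
import json


def last_state_value(lines):
    """Return the "value" of the last STATE message in the lines, or None.

    Hypothetical helper. Assumes one JSON message per line; lines that are
    blank or not valid JSON (for example stray log output) are skipped.
    """
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(message, dict) and message.get("type") == "STATE":
            state = message.get("value")
    return state


# Example use (file names follow the README):
#   with open("output.txt") as f:
#       state = last_state_value(f)
#   with open("state.json", "w") as f:
#       json.dump(state, f)
```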

## Configuration

Detailed configuration information for the `--config` key.

```bash
tap-circle-ci --config config.json --catalog catalog.json
```
| key | type | default | description |
| --------------- | -------- | ------- | ------------------------------------------------------ |
| `token` | `string` | `N/A` | [Personal API Token](https://circleci.com/account/api) |
| `project_slugs` | `string` | `N/A` | Space ` ` delimited string of CCI project slugs |
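
To make the table concrete, here is a small sketch of how a config with these two keys could be checked and the space-delimited slugs split into a list. `parse_config` is a hypothetical function written for illustration; it is not claimed to match how the tap itself reads its config.

```python
import json

REQUIRED_KEYS = ("token", "project_slugs")


def parse_config(raw):
    """Validate a config JSON string and return (token, list_of_slugs).

    Hypothetical helper for illustration only. Both keys are required
    and must be non-empty strings, matching the table above.
    """
    config = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError("missing required config keys: " + ", ".join(missing))
    for key in REQUIRED_KEYS:
        if not isinstance(config[key], str):
            raise TypeError(f"config key {key!r} must be a string")
    # str.split() with no argument splits on any run of whitespace.
    return config["token"], config["project_slugs"].split()
```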

To save output to a file:
```bash
tap-circle-ci --config config.json --catalog catalog.json > output.txt
```
It is our intention that this singer tap gets used with a singer target, which will load the output into a database.
More information on singer targets [here](https://github.com/singer-io/getting-started/blob/master/docs/RUNNING_AND_DEVELOPING.md#running-a-singer-tap-with-a-singer-target).
---

Copyright &copy; 2020 Sisu Data
10 changes: 10 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,10 @@
version: '3'
services:
  tap:
    image: python:3.9.5-buster
    entrypoint: /code/docker-entrypoint.sh
    working_dir: /code
    environment:
      DEPLOYMENT: "${DEPLOYMENT}"
    volumes:
      - .:/code
11 changes: 11 additions & 0 deletions docker-entrypoint.sh
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

python -m venv venv/tap-circle-ci
source /code/venv/tap-circle-ci/bin/activate

pip install -e .[tests]

echo "source /code/venv/tap-circle-ci/bin/activate" >> ~/.bashrc
echo -e "\n\nINFO: Dev environment ready."

tail -f /dev/null
5 changes: 5 additions & 0 deletions pytest.ini
@@ -0,0 +1,5 @@
[pytest]
filterwarnings =
    error
    ignore::UserWarning
    ignore:.*Using or importing the ABCs from:DeprecationWarning
4 changes: 4 additions & 0 deletions sample-config.json
@@ -0,0 +1,4 @@
{
  "token": "TODO",
  "project_slugs": "gh/singer-io/singer-python gh/singer-io/getting-started gh/apollographql/tap-circleci"
}
7 changes: 7 additions & 0 deletions sample-state.json
@@ -0,0 +1,7 @@
{
  "bookmarks": {
    "gh/apollographql/tap-circle-ci": {
      "pipelines": { "since": "2023-11-15T00:00:00.000000Z" }
    }
  }
}
13 changes: 8 additions & 5 deletions setup.py
@@ -2,7 +2,7 @@
from setuptools import setup

setup(
    name="tap-circle-ci",
    name="tap-circleci",
    version="0.1.0",
    description="Singer.io tap for extracting data from circle ci",
    author="Sisu Data",
@@ -11,13 +11,16 @@
    py_modules=["tap_circle_ci"],
    install_requires=[
        "singer-python>=5.0.12",
        "requests",
        "pytest",
        "mock"
        "requests"
    ],
    extras_require={
        'tests': [
            "mock",
            "pytest"
        ]},
    entry_points="""
    [console_scripts]
    tap-circle-ci=tap_circle_ci:main
    tap-circleci=tap_circle_ci:main
    """,
    packages=["tap_circle_ci"],
    package_data={