Commit 82f24b5

Merge branch 'master' into feat/burncloud
2 parents 90888e0 + bb7d65f commit 82f24b5

208 files changed (+12485, -4072 lines)

.github/workflows/dev-build.yaml

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ concurrency:

 on:
   push:
-    branches: ['4034-version-control'] # put your current branch to create a build. Core team only.
+    branches: ['upload-ui-ux'] # put your current branch to create a build. Core team only.
     paths-ignore:
       - '**.md'
       - 'cloud-deployments/*'

.github/workflows/run-tests.yaml

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+name: Run backend tests
+
+concurrency:
+  group: build-${{ github.ref }}
+  cancel-in-progress: true
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened]
+    paths:
+      - "server/**.js"
+      - "collector/**.js"
+
+jobs:
+  run-script:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v2
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v3
+        with:
+          node-version: '18'
+
+      - name: Cache root dependencies
+        uses: actions/cache@v3
+        with:
+          path: |
+            node_modules
+            ~/.cache/yarn
+          key: ${{ runner.os }}-yarn-root-${{ hashFiles('**/yarn.lock') }}
+          restore-keys: |
+            ${{ runner.os }}-yarn-root-
+
+      - name: Cache server dependencies
+        uses: actions/cache@v3
+        with:
+          path: |
+            server/node_modules
+            ~/.cache/yarn
+          key: ${{ runner.os }}-yarn-server-${{ hashFiles('server/yarn.lock') }}
+          restore-keys: |
+            ${{ runner.os }}-yarn-server-
+
+      - name: Cache collector dependencies
+        uses: actions/cache@v3
+        with:
+          path: |
+            collector/node_modules
+            ~/.cache/yarn
+          key: ${{ runner.os }}-yarn-collector-${{ hashFiles('collector/yarn.lock') }}
+          restore-keys: |
+            ${{ runner.os }}-yarn-collector-
+
+      - name: Install root dependencies
+        if: steps.cache-root.outputs.cache-hit != 'true'
+        run: yarn install --frozen-lockfile
+
+      - name: Install server dependencies
+        if: steps.cache-server.outputs.cache-hit != 'true'
+        run: cd server && yarn install --frozen-lockfile
+
+      - name: Install collector dependencies
+        if: steps.cache-collector.outputs.cache-hit != 'true'
+        run: cd collector && yarn install --frozen-lockfile
+
+      - name: Setup environment and Prisma
+        run: yarn setup:envs && yarn prisma:setup
+
+      - name: Run test suites
+        run: yarn test
+
+      - name: Fail job on error
+        if: failure()
+        run: exit 1

BARE_METAL.md

Lines changed: 1 addition & 1 deletion
@@ -86,7 +86,7 @@ curl -I "http://localhost:3001/api/env-dump" | head -n 1|cut -d$' ' -f2
 echo "Rebuilding Frontend"
 cd $HOME/anything-llm/frontend && yarn && yarn build && cd $HOME/anything-llm

-echo "Copying to Sever Public"
+echo "Copying to Server Public"
 rm -rf server/public
 cp -r frontend/dist server/public

CONTRIBUTING.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
+# Contributing to AnythingLLM
+
+AnythingLLM is an open-source project and we welcome contributions from the community.
+
+## Reporting Issues
+
+If you encounter a bug or have a feature request, please open an issue on the
+[GitHub issue tracker](https://github.com/mintplex-labs/anything-llm).
+
+## Picking an issue
+
+We track issues on the GitHub issue tracker. If you are looking for something to
+work on, check the [good first issue](https://github.com/mintplex-labs/anything-llm/contribute) label. These issues are typically the best described and have the smallest scope. There may be issues that are not labeled as good first issue, but are still a good starting point.
+
+If there's an issue you are interested in working on, please leave a comment on the issue. This will help us avoid duplicate work. Additionally, if you have questions about the issue, please ask them in the issue comments. We are happy to provide guidance on how to approach the issue.
+
+## Before you start
+
+Keep in mind that we are a small team and have limited resources. We will do our best to review and merge your PRs, but please be patient. Ultimately, **we become the maintainer** of your changes. It is our responsibility to make sure that the changes are working as expected and are of high quality as well as being compatible with the rest of the project both for existing users and for future users & features.
+
+Before you start working on an issue, please read the following so that you don't waste time on something that is not a good fit for the project or is more suitable for a personal fork. We would rather answer a comment on an issue than close a PR after you've spent time on it. Your time is valuable and we appreciate your time and effort to make AnythingLLM better.
+
+0. (most important) If you are making a PR that does not have a corresponding issue, **it will not be merged.** _The only exception to this is language translations._
+
+1. If you are modifying the permission system for a new role or something custom, you are likely better off forking the project and building your own version since this is a core part of the project and is only to be maintained by the AnythingLLM team.
+
+2. Integrations (LLM, Vector DB, etc.) are reviewed at our discretion. We will eventually get to them. Do not expect us to merge your integration PR instantly since there are often many moving parts and we want to make sure we get it right. We will get to it!
+
+3. It is our discretion to merge or not merge a PR. We value every contribution, but we also value the quality of the code and the user experience we envision for the project. It is a fine line to walk when running a project like this and please understand that merging or not merging a PR is not a reflection of the quality of the contribution and is not personal. We will do our best to provide feedback on the PR and help you make the changes necessary to get it merged.
+
+4. **Security** is always important. If you have a security concern, please do not open an issue. Instead, please open a CVE on our designated reporting platform [Huntr](https://huntr.com) or contact us at [[email protected]](mailto:[email protected]).
+
+## Configuring Git
+
+First, fork the repository on GitHub, then clone your fork:
+
+```bash
+git clone https://github.com/<username>/anything-llm.git
+cd anything-llm
+```
+
+Then add the main repository as a remote:
+
+```bash
+git remote add upstream https://github.com/mintplex-labs/anything-llm.git
+git fetch upstream
+```
+
+## Setting up your development environment
+
+In the root of the repository, run:
+
+```bash
+yarn setup
+```
+
+This will install the dependencies, set up the proper and expected ENV files for the project, and run the prisma setup script.
+Next, run:
+
+```bash
+yarn dev:all
+```
+This will start the server, frontend, and collector in development mode. Changes to the code will be hot reloaded.
+
+## Best practices for pull requests
+
+For the best chance of having your pull request accepted, please follow these guidelines:
+
+1. Unit test all bug fixes and new features. Your code will not be merged if it
+   doesn't have tests.
+1. If you change the public API, update the documentation in the `anythingllm-docs` repository.
+1. Aim to minimize the number of changes in each pull request. Keep to solving
+   one problem at a time, when possible.
+1. Before marking a pull request ready-for-review, do a self review of your code.
+   Is it clear why you are making the changes? Are the changes easy to understand?
+1. Use [conventional commit messages](https://www.conventionalcommits.org/en/) as pull request titles. Examples:
+   * New feature: `feat: adding foo API`
+   * Bug fix: `fix: issue with foo API`
+   * Documentation change: `docs: adding foo API documentation`
+1. If your pull request is a work in progress, leave the pull request as a draft.
+   We will assume the pull request is ready for review when it is opened.
+1. When writing tests, test the error cases. Make sure they have understandable
+   error messages.
+
+## Project structure
+
+The core library is written in Node.js. There are additional sub-repositories for the embed widget and browser extension. These are not part of the core AnythingLLM project, but are maintained by the AnythingLLM team.
+
+* `server`: Node.js server source code
+* `frontend`: React frontend source code
+* `collector`: Node.js collector source code
+
+## Release process
+
+Changes to the core AnythingLLM project are released through the `master` branch. When a PR is merged into `master`, a new version of the package is published to Docker Hub and GitHub Container Registry under the `latest` tag.
+
+When a new version is released, a new image is built and pushed to Docker Hub and GitHub Container Registry under the associated version tag. Version tags are of the format `v<major>.<minor>.<patch>` and are pinned code, while `latest` is the latest version of the code at any point in time.
+
+### Desktop propagation
+
+Changes to the desktop app are downstream of the core AnythingLLM project. Releases of the desktop app are published at the same time as the core AnythingLLM project. Code from the core AnythingLLM project is copied into the desktop app, inside an Electron wrapper. The Electron wrapper is **not** part of the core AnythingLLM project, but is maintained by the AnythingLLM team.
+
+## License
+
+By contributing to AnythingLLM (this repository), you agree to license your contributions under the MIT license.
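
Review note: the release process in this file gives the version tag format as `v<major>.<minor>.<patch>`. A minimal sketch of how a tag could be checked against that format follows. The helper name `parseVersionTag` is hypothetical and not part of the repository.

```javascript
// Hypothetical helper: checks a tag against the `v<major>.<minor>.<patch>`
// format described in CONTRIBUTING.md and returns its numeric parts.
const VERSION_TAG = /^v(\d+)\.(\d+)\.(\d+)$/;

function parseVersionTag(tag) {
  const match = VERSION_TAG.exec(tag);
  if (!match) return null; // e.g. "latest", or a tag missing the "v" prefix
  const [major, minor, patch] = match.slice(1).map(Number);
  return { major, minor, patch };
}

console.log(parseVersionTag("v1.8.5")); // { major: 1, minor: 8, patch: 5 }
console.log(parseVersionTag("latest")); // null
console.log(parseVersionTag("1.8.5"));  // null
```

The floating `latest` tag deliberately does not match, which mirrors the distinction the document draws between pinned version tags and `latest`.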

README.md

Lines changed: 3 additions & 5 deletions
@@ -101,6 +101,7 @@ AnythingLLM divides your documents into objects called `workspaces`. A Workspace
 - [xAI](https://x.ai/)
 - [Novita AI (chat models)](https://novita.ai/model-api/product/llm-api?utm_source=github_anything-llm&utm_medium=github_readme&utm_campaign=link)
 - [PPIO](https://ppinfra.com?utm_source=github_anything-llm)
+- [Moonshot AI](https://www.moonshot.ai/)

 **Embedder models:**

@@ -135,7 +136,7 @@ AnythingLLM divides your documents into objects called `workspaces`. A Workspace
 - [PGVector](https://github.com/pgvector/pgvector)
 - [Astra DB](https://www.datastax.com/products/datastax-astra)
 - [Pinecone](https://pinecone.io)
-- [Chroma](https://trychroma.com)
+- [Chroma & ChromaCloud](https://trychroma.com)
 - [Weaviate](https://weaviate.io)
 - [Qdrant](https://qdrant.tech)
 - [Milvus](https://milvus.io)
@@ -222,12 +223,9 @@ We take privacy very seriously, and we hope you understand that we want to learn

 </details>

-
 ## 👋 Contributing

-- create issue
-- create PR with branch name format of `<issue number>-<short name>`
-- LGTM from core-team
+- [Contributing to AnythingLLM](./CONTRIBUTING.md) - How to contribute to AnythingLLM.

 ## 💖 Sponsors
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+const { YoutubeTranscript } = require("../../../../../utils/extensions/YoutubeTranscript/YoutubeLoader/youtube-transcript.js");
+
+describe("YoutubeTranscript", () => {
+  it("should fetch transcript from YouTube video", async () => {
+    const videoId = "BJjsfNO5JTo";
+    const transcript = await YoutubeTranscript.fetchTranscript(videoId, {
+      lang: "en",
+    });
+
+    expect(transcript).toBeDefined();
+    expect(typeof transcript).toBe("string");
+    expect(transcript.length).toBeGreaterThan(0);
+    // console.log("Success! Transcript length:", transcript.length);
+    // console.log("First 200 characters:", transcript.substring(0, 200) + "...");
+  }, 30000);
+});

collector/index.js

Lines changed: 33 additions & 0 deletions
@@ -58,6 +58,39 @@ app.post(
   }
 );

+app.post(
+  "/parse",
+  [verifyPayloadIntegrity],
+  async function (request, response) {
+    const { filename, options = {} } = reqBody(request);
+    try {
+      const targetFilename = path
+        .normalize(filename)
+        .replace(/^(\.\.(\/|\\|$))+/, "");
+      const {
+        success,
+        reason,
+        documents = [],
+      } = await processSingleFile(targetFilename, {
+        ...options,
+        parseOnly: true,
+      });
+      response
+        .status(200)
+        .json({ filename: targetFilename, success, reason, documents });
+    } catch (e) {
+      console.error(e);
+      response.status(200).json({
+        filename: filename,
+        success: false,
+        reason: "A processing error occurred.",
+        documents: [],
+      });
+    }
+    return;
+  }
+);
+
 app.post(
   "/process-link",
   [verifyPayloadIntegrity],
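
Review note: the new `/parse` handler strips leading `..` segments from `filename` before handing it to `processSingleFile`. The sketch below isolates that expression to show what it does to a few inputs. It assumes a POSIX host and therefore uses `path.posix.normalize`, whereas the handler calls `path.normalize`. The function name `sanitizeFilename` is hypothetical.

```javascript
const path = require("path");

// Same normalize-then-strip expression as the /parse handler, wrapped in a
// hypothetical helper. path.posix is used so results do not vary by host OS.
function sanitizeFilename(filename) {
  return path.posix.normalize(filename).replace(/^(\.\.(\/|\\|$))+/, "");
}

console.log(sanitizeFilename("report.pdf"));            // report.pdf
console.log(sanitizeFilename("../../etc/passwd"));      // etc/passwd
console.log(sanitizeFilename("docs/../../secret.txt")); // secret.txt
console.log(sanitizeFilename("a/./b/../c.txt"));        // a/c.txt
```

Normalizing first collapses inner `..` segments, so any remaining parent references sit at the front of the string, where the anchored regex removes them.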

collector/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "anything-llm-document-collector",
-  "version": "1.8.2",
+  "version": "1.8.5",
   "description": "Document collector server endpoints",
   "main": "index.js",
   "author": "Timothy Carambat (Mintplex Labs)",

collector/processLink/convert/generic.js

Lines changed: 3 additions & 3 deletions
@@ -62,10 +62,10 @@ async function scrapeGenericUrl({
     token_count_estimate: tokenizeString(content),
   };

-  const document = writeToServerDocuments(
+  const document = writeToServerDocuments({
     data,
-    `url-${slugify(filename)}-${data.id}`
-  );
+    filename: `url-${slugify(filename)}-${data.id}`,
+  });
   console.log(`[SUCCESS]: URL ${link} converted & ready for embedding.\n`);
   return { success: true, reason: null, documents: [document] };
 }

collector/processRawText/index.js

Lines changed: 3 additions & 3 deletions
@@ -58,10 +58,10 @@ async function processRawText(textContent, metadata) {
     token_count_estimate: tokenizeString(textContent),
   };

-  const document = writeToServerDocuments(
+  const document = writeToServerDocuments({
     data,
-    `raw-${stripAndSlug(metadata.title)}-${data.id}`
-  );
+    filename: `raw-${stripAndSlug(metadata.title)}-${data.id}`,
+  });
   console.log(`[SUCCESS]: Raw text and metadata saved & ready for embedding.\n`);
   return { success: true, reason: null, documents: [document] };
 }
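
Review note: both call sites above move `writeToServerDocuments` from positional arguments to a single object argument. The sketch below illustrates that calling convention only. `writeDocumentSketch`, its `options` field, and the `custom-documents/` path are assumptions made for the example; the real function's signature and on-disk behavior are not shown in this diff.

```javascript
// Hypothetical stand-in that builds a record in memory instead of writing to
// disk. Only `data` and `filename` come from the diff; `options` and the
// location prefix are illustrative assumptions.
function writeDocumentSketch({ data = {}, filename, options = {} }) {
  if (!filename) throw new Error("filename is required");
  return {
    ...data,
    location: `custom-documents/${filename}.json`,
    parseOnly: Boolean(options.parseOnly),
  };
}

const data = { id: "1234", title: "Example" };
const doc = writeDocumentSketch({ data, filename: `raw-example-${data.id}` });
console.log(doc.location); // custom-documents/raw-example-1234.json
```

With named fields, a later optional setting can be added without changing existing callers, which positional arguments would not allow as cleanly.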
