-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query Limits Management Feature #7
base: main
Are you sure you want to change the base?
Query Limits Management Feature #7
Conversation
…l students. TODO: Implement a better way of adding in the button to the datatable instead of hardcoding as it working right now
All contributors have signed the CLA ✍️ ✅ |
I have read and hereby sign the CLA. |
recheck |
Thanks for your contribution! It's substantial, so it will take me some time to review it. It looks like some tests are failing, and I'll probably also have thoughts on style and organization in the changes to keep things consistent with the current design. If I do, do you have a preference for whether I pass comments back to you to make changes vs making those changes myself? I'm fine either way, and I don't want to burden you with my requests if you don't have the capacity. |
Hello,
Thank you for reaching out! I don’t have a preference—feel free to make the changes yourself if you’d like, or I can handle them if you’re busy. Regarding the failing tests, I suspect the issue might be related to line 156 in db.py. I couldn’t get the database to initialize with the “encoding” parameter, which might be due to differences in how Python handles this on Mac vs. Windows.
Best,
Rana Moeez Hassan
LinkedIn <https://www.linkedin.com/in/rana-moeez-hassan-042ab325a/>
GitHub <https://github.com/ranamoeezhassan>
… On Jan 17, 2025, at 12:24 AM, Mark Liffiton ***@***.***> wrote:
Thanks for your contribution! It's substantial, so it will take me some time to review it. It looks like some tests are failing, and I'll probably also have thoughts on style and organization in the changes to keep things consistent with the current design. If I do, do you have a preference for whether I pass comments back to you to make changes vs making those changes myself? I'm fine either way, and I don't want to burden you with my requests if you don't have the capacity.
—
Reply to this email directly, view it on GitHub <https://urldefense.com/v3/__https://github.com/liffiton/Gen-Ed/pull/7*issuecomment-2597459820__;Iw!!P_zEGVH0kSMiWA!Bc5cX69kVTVe6DwRI_CpcBgLfJR7LWtZwTEb7njAfpDXMcQFYkX82LyIuyc7bhtT3-bGkVPSDI-bY1v2DKb8mfaBKBw$>, or unsubscribe <https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/A45HDYC6HDF66UMEAC5AMSL2LCHYRAVCNFSM6AAAAABVLDFIKCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKOJXGQ2TSOBSGA__;!!P_zEGVH0kSMiWA!Bc5cX69kVTVe6DwRI_CpcBgLfJR7LWtZwTEb7njAfpDXMcQFYkX82LyIuyc7bhtT3-bGkVPSDI-bY1v2DKb8IwLm9TQ$>.
You are receiving this because you authored the thread.
|
Okay, great! I'll review the PR and make comments through it. If you can handle at least the simpler ones, that would be great. And feel free to respond to a comment if it's not clear or you disagree or anything. Then at some point, I might merge this into a development branch to do a bit more work on it before merging it into main. I'm seeing a few places where I might want to tweak the UI or the structure of some code to match how I've been designing things. Are you developing on a Mac? I haven't tested this on a Mac myself. If you run into any issues running the code (I guess whether on a Mac or not), please do open an issue for it. I'm primarily focused on Linux, but if I can help folks develop on Windows or Mac without too much trouble, I'm happy to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've found some things to clean up or improve. The database design issue is a bigger one that might take some work to get in shape. I haven't reviewed everything quite yet, and I'll review more once these issues are addressed.
ALTER TABLE classes ADD COLUMN max_queries INTEGER NOT NULL DEFAULT 50; | ||
|
||
-- Ensure the users table has a column to track the number of queries used | ||
ALTER TABLE users ADD COLUMN queries_used INTEGER NOT NULL DEFAULT 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is just tied to the user, then it won't work correctly for users in multiple classes, right? I think this might need a different design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I did not realize that a student account could be in multiple classes. Will need to rethink this.
src/codehelp/migrations/20250112--codehelp--add_query_limits.sql
Outdated
Show resolved
Hide resolved
BEGIN; | ||
|
||
-- Add columns to the classes table | ||
ALTER TABLE classes ADD COLUMN query_limit_enabled BOOLEAN NOT NULL DEFAULT 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this needs to be a separate column, or if it would work (and be a bit simpler) to just have a query_limit column that's either an integer (when there is a limit active) or NULL when there is no limit in a class. I'm leaning toward the simpler design, unless there's a reason I'm not seeing to have two columns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the simpler design would definitely work. We would just need to modify a few SQL statements that are looking for this column to check if the query_limit is either NULL or an integer. And of course, set the value accordingly when the instructor either enables or disables this feature.
@@ -53,6 +53,23 @@ class ResponseCol(Col): | |||
kind: Final = 'html' | |||
prerender: Final[Callable[[str], str]] = fmt_response_txt | |||
|
|||
@dataclass(frozen=True) | |||
class ButtonCol(Col): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An earlier comment suggests this isn't currently used, and I don't see any uses (might be missing something). If it's not used, let's not include it or anything related to it in the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left it in because I was trying to avoid hardcoding the HTML in the database statement like you mentioned, and I wanted a cleaner implementation. But I couldn’t get it to work. I kept it in in case you have any ideas why creating a new class like this and adding it to the data table didn’t work. The fmt_button_text in filters.py was also supposed to work with this class to make the button work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, yeah. I think the general approach could work, but I can't really say why it didn't without seeing more and digging into it some. But like I think I mentioned somewhere else, the Actions list in the Datatable macro/class should enable the functionality you're looking for here, so sticking with that will be simplest.
{% block body %} | ||
{% block body %}' | ||
<head> | ||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not use FontAwesome. I'm including SVG for any icons I use, mostly from Lucide. I can handle that myself if you want to just leave a placeholder in place of any icon you're using from FontAwesome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed icon, and copyright info and put a placeholder in place (still hardcoded though).
json_array(users.display_name, auth_providers.name, users.display_extra) AS user, | ||
users.queries_used, | ||
classes.max_queries, | ||
('<form method="POST" action="/instructor/reset_student_queries/' || users.id || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generating HTML in the SQL statement is an overly-complex way to accomplish this. I think using the actions
list in the DataTable can handle this in a much simpler way. Let me know if that doesn't work here for some reason.
Thanks for those updates. Looking good so far! If you have ideas on the database design, feel free to share them here and we can discuss before implementing something. I think it might be best to attach per-user-per-class counts or limits to roles, as roles are what define the user-to-class relationship. I see the note in the commit message about tests failing on MacOS now. Please do open an issue about that if you can provide more information. Sorry I'm slow in responding to things here. I have a lot less time for it during the week. |
Thanks for following up! Totally get having less time during the week—it’s the same for me. I managed to fix the pytest errors I was running into; turns out all it needed was a quick reinstallation of the dependencies! I had two ideas. One was pretty rough (and inefficient): adding a separate column for each class in the users table. But I also thought about something similar to your "per-user-per-class counts" idea, which seems way more promising. I’m not entirely sure I understand the roles limit idea, could you clarify how that would work? |
Sure! So every user/class registration is a row in the As an alternative, and possibly cleaner, we could store a query limit in the And then there's the question of storing query counts. I think I'd prefer to store limits but not user query counts. We can instead rely on database queries to generate counts as needed, as long as the performance of that is fine. Storing a count raises issues of having to always ensure it is consistent with the actual stored queries. Having a single source of truth for the count (the queries themselves) seems cleaner. And I think as long as the right database indexes are in place, an SQL query to count a user's queries should run really fast. Which design would fit your requirements? I have a preference for simpler solutions as long as they meet current needs, unless it's clear that a more complex solution would be necessary soon anyway. (To be honest, I'm not clear on the use case for these limits in general, as it has never been a problem across the thousands of students who have used CodeHelp so far. With no limits in place, the most prolific student of all (so far) had usage amounting to only ten dollars or so, and that level of usage is incredibly rare. The average has been about ten cents of usage per student.) |
Aha, I see! I’m on board with the second idea of having a global query limit for the whole class in the classes table instead of setting individual limits for students. It feels simpler and way less messy to manage. As for storing query counts, I don’t have much experience with this, so I’ll trust your judgment. Your point about keeping the database as the single "source of truth" sounds logical, and if the performance checks out, I’m sure it’s the right way to go. I think adding a query limit is a good idea, especially for professors with a fixed budget who want to avoid going over. It seems like a better option than having them manually set usage limits in their OpenAI developer account. For example, the professor I’m working with is planning to use this tool in a MOOC (a big online course), and without limits, both new and existing students would have unrestricted access to the models—which might not be the best setup right off the bat. Lastly, would you prefer to implement and test these ideas yourself, or do you want me to do it? |
Okay, great! For a MOOC, I can see how setting limits in advance would feel most comfortable. I would bet no one would abuse it even without limits (there's not much to gain or even be entertained by when you're restricted to CS-help responses only). But it should be simple enough to add with the design we've discussed. Do you know whether the plan is for you to self-host CodeHelp for the MOOC or to use my hosted version at codehelp.app? Using codehelp.app should be alright -- I think it can scale to MOOC size okay -- but it would be helpful to know in advance and talk through the expected load in advance. Feel free to email me separately about that if needed. As for who implements it, I'm fine either way, but I might not get to it quickly myself. It would be helpful if you can get a proof of concept working in this PR, and then I could merge it and polish things from there myself. |
The plan is to do a "soft" launch with a small class that takes place here at Dartmouth College using your hosted code help. If that goes well, we plan on hosting the application ourselves and using Dartmouth's proprietary AI models that are hosted instead of the OpenAI API. My professor will email you once we have a solid plan! |
Got it. For using self-hosted models, if they have an OpenAI-compatible endpoint then it should be very easy to integrate that. I made some changes in December to make the code more LLM-provider-agnostic, and I plan to do more in that direction. With just a few more changes (in the works), you could even use self-hosted LLMs from codehelp.app if you wanted to. Okay, but I'm getting off-topic from the PR here... 😄 |
Wow, that's awesome! We got Dartmouth's Codellama and Llama completion endpoints working, but it wasn’t exactly smooth sailing; we ended up breaking a few things we probably shouldn’t have. But hey, it was just a proof of concept, and it's another reason that having a built-in query limit would be super helpful so we don’t accidentally overload our local servers by giving students free rein. It’d be amazing if the official app had support for this that would save us a lot of hassle with hosting our own setup! |
This update enables instructors to manage and monitor student AI query usage. Instructors can set a maximum query limit for each student, view individual usage statistics, and reset query counts as needed. The feature was actually requested by my Professor that I am working with. It includes user interface components for configuring limits, tracking usage, and performing reset actions, and lets instructors have full oversight of query consumption to manage the costs and prevent misuse.
by students.