ChatGPT #390
Comments
So good news! You can actually do this yourself,
Where you change the values to get the result you want. The model itself is COCOMO https://en.wikipedia.org/wiki/COCOMO so you can look at how it works to get an idea. You can couple this with

If you come up with a custom type that works as you would expect, pick a good name for it (the current ones are organic, semi-detached and embedded) and I can include it as an option.

It sounds like you might be suggesting a better model, though. I am open to this idea, but I need to see how the model is actually implemented. COCOMO was my first choice since I wanted to replicate the functionality of sloccount https://dwheeler.com/sloccount/ and because the model is fairly easy to implement. However, I am very open to having multiple calculation models in the tool if it gives better results. Of course, I need an example of how to implement it first. I did investigate COCOMO II for a while but could never find an implementation to copy from. I guess we could develop our own based on the same values that COCOMO uses, factoring in complexity and perhaps even the language as a scaling factor, but I would prefer to use something that already exists.

@dataf3l If you feel like finding some weights that fit your expectations across multiple projects, post them here. I will try them across a few projects and see what shakes out. If it seems good, you get the glory of picking a new name. If, however, you find a better model, I will be happy to add the ability to flip from COCOMO to something else. I'm also happy to do both if you find that a better option.
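For anyone who wants to experiment with weights before posting them, here is a minimal sketch of the basic COCOMO calculation in Python. The organic, semi-detached and embedded coefficients are the published basic COCOMO values; the names `CocomoParams`, `estimate` and `estimate_cost`, the wage and overhead figures, and the extra `my-custom-type` entry are assumptions made up for illustration, and this is not scc's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CocomoParams:
    a: float  # effort multiplier
    b: float  # effort exponent
    c: float  # schedule multiplier
    d: float  # schedule exponent


# Published basic COCOMO coefficients for the three classic project types.
BASIC_COCOMO = {
    "organic": CocomoParams(2.4, 1.05, 2.5, 0.38),
    "semi-detached": CocomoParams(3.0, 1.12, 2.5, 0.35),
    "embedded": CocomoParams(3.6, 1.20, 2.5, 0.32),
    # Hypothetical custom type: the numbers are placeholders, NOT calibrated.
    "my-custom-type": CocomoParams(1.2, 1.05, 2.5, 0.38),
}


def estimate(sloc: int, params: CocomoParams) -> tuple[float, float]:
    """Return (effort in person-months, schedule in months)."""
    kloc = sloc / 1000
    effort = params.a * kloc ** params.b
    schedule = params.c * effort ** params.d
    return effort, schedule


def estimate_cost(effort_months: float, annual_wage: float, overhead: float) -> float:
    """Convert effort to money; wage and overhead are inputs you choose."""
    return effort_months * (annual_wage / 12) * overhead


if __name__ == "__main__":
    for name, params in BASIC_COCOMO.items():
        effort, schedule = estimate(1000, params)
        # The wage and overhead below are arbitrary example values.
        cost = estimate_cost(effort, annual_wage=60000, overhead=2.0)
        print(f"{name}: {effort:.2f} person-months, {schedule:.2f} months, cost {cost:.0f}")
```

Because the schedule is a power law with a small exponent, even a very small code base yields a schedule of a few months under the organic coefficients, which is consistent with the behavior reported in this issue.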
AI should reduce the cost of development: https://www.youtube.com/watch?v=VErKCq1IGIU&ab_channel=MOVClips I kinda expected to be told to go change the avg-salary variable, and I kinda knew that. What I'm trying to point out is that I think the world needs a "cocomo.ai" or something similar, based on these (new) assumptions:
So, in the short term, we can expect gains, but what about the long term? https://www.youtube.com/watch?v=GFiWEjCedzY&ab_channel=DisneyLivin Are people going to have to be master wizards to fix the bugs introduced by the AI, and will that impact maintenance cost?

With all these things in mind, I think this may affect the SCC project, and others like it, such as CLOC and friends. I think AI will bring a new era of software, where developers are more abundant, since more people will become

Will surfing waves of code become the new normal? How will tools evolve to rise to this challenge? I think there is a need for a new costing model; whether it's called "cocomo III" or "cocomo.ai" or whatever is not the main

If developers ask the AI to build the code, go to sleep, and then wake up 8 hours later to review the code written and tested by the AI, and find that the code is OK and ready to go, should they charge 8 hours? Was that 8 hours of labor? Should that be priced in? If developers need a huge computer to run a local model, like LLaMA or Alpaca, should the cost of the GPUs be taken into consideration? Are slow laptops no longer usable for development in the new AI age? Will there be an abundance of these GPUs?

I think that the tools need to adapt to the new reality. I don't have a plan for how to accomplish a more realistic model, but here are some numbers:

If things like Copilot and ChatGPT make developers more productive, how can that be quantified in SCC? If Copilot generates more repetitive code, and developers are no longer incentivized to make a beautiful masterpiece, but will rather "crank out repetitive code" because that's what the AI gives and they are too lazy to change it (they are developers after all), does this mean "more code" does not equate to "more time"?
What if a developer explicitly asks in the prompt for "no comments please", or "make it terse"? Will these out-of-distribution prompts cause the number of lines of code to become a meaningless metric? Just some things I've been thinking about.

@boyter I give you instead the honor and glory of naming it whatever you desire. But will you please help this new, confused world with a new version of scc, one that uses dates and perhaps also a statistical model that predicts whether a line of code was made by AI or not, so we can better assess what was made by AI (cheaply) and what was made by humans (costing dearly?), and get a truer perspective of the real cost? These, I think, are the new challenges.
I'm going to park this, in the sense that I am not actively developing for it. I am not saying no. I am saying not right now, simply because the landscape is moving too quickly at the moment to start implementing. Waiting a few months seems like a decent approach.

I am keeping a close eye on the state of things. However, the following still appears to be true. Lines of code is still a metric that can be used to inform: at some point someone needs to maintain or understand the code, so it still has meaning, and the number of lines is an indicator of where there might be issues.

The cost question is one to consider, but again it's estimating the cost to develop based on the assumption that a person wrote the code. If there is a way to model code based on the assumption that a computer wrote it, then we can use that (assuming one exists), and if we can determine who wrote the code, that would allow a better breakdown. Until we have the following,
There is little that can be done right now. I am keeping an eye on both of these. It's possible there will need to be a new tool to deal with these problems and have that feed into scc, but that remains to be seen.
I agree that the landscape moves quickly. Lines of code can be summarized, refactored and expanded; that's a new thing, it wasn't there before. This tool can detect GPT-2: it uses the model weights to determine whether text was generated or not. OpenAI has also released a tool to determine whether text came out of GPT-3, but as time goes by and tools improve, this task will become harder and harder. https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text Also, there will be multiple models, so there is that.

Maybe we can just put in a multiplier for lines identified as not written by humans. For example, an XML file with a million lines probably wasn't written by hand. Nobody writes their package-lock.json, for example. So maybe look at the variability of the file itself: if it looks very repetitive, perhaps it's an output, not human-made code. Libraries, like node_modules, /vendor, /venv, etc., should be discarded as well.

I think even though we can't solve 100% of the problem, some heuristic for evaluating and discarding files would help. Maybe another tool is required to discard or categorize files, which can then be used as an input for scc?

And another thing: thank you for setting expectations from the start, I appreciate it.
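A rough sketch of the heuristic described above, in Python: vendored directories get a weight of zero, and very repetitive files get a reduced multiplier. The function names, the 0.3 threshold and the 0.1 multiplier are all assumptions picked for illustration, not values from scc or any other tool, and the unique-line ratio is only one possible stand-in for "variability".

```python
from pathlib import PurePosixPath

# Directory names mentioned in the comment above; extend as needed.
VENDOR_DIRS = {"node_modules", "vendor", "venv"}


def in_vendored_dir(path: str) -> bool:
    """True if any component of the path is a known vendored directory."""
    return any(part in VENDOR_DIRS for part in PurePosixPath(path).parts)


def unique_line_ratio(text: str) -> float:
    """Share of distinct non-blank lines; low values mean repetitive content."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 1.0
    return len(set(lines)) / len(lines)


def weight_for(path: str, text: str,
               threshold: float = 0.3,
               generated_weight: float = 0.1) -> float:
    """Multiplier to apply to a file's line count before cost estimation.

    threshold and generated_weight are arbitrary illustrative defaults.
    """
    if in_vendored_dir(path):
        return 0.0
    if unique_line_ratio(text) < threshold:
        return generated_weight
    return 1.0


if __name__ == "__main__":
    print(weight_for("node_modules/lib/index.js", "let a = 1;\n"))
    print(weight_for("data/export.xml", "<item/>\n" * 1000))
    print(weight_for("src/main.py", "import os\nprint(os.getcwd())\n"))
```

A heuristic like this will misjudge some files (hand-written tables, for instance, are repetitive too), so it is better suited to producing a list for a human to review than to silently discarding files.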
To solve the package-lock.json and other such files, I strongly suggest using a .ignore file, which will be respected in the same way as in tools like ripgrep, the Silver Searcher and such. I have debated adding vendor and similar directories as a default to ignore, but I think it's a default that would surprise the user, hence I did not include it. I am in the middle of refactoring scc to have proper support for all features of gitignore and ignore files, but that's still a little way off. Avoid globs for the moment and it works, though.

As for determining if it was written by GPT, that tool looks interesting. I wonder if you could use it to append a comment on files, or dynamically update the ignore file to support this. You can use the various scc options to disable looking at certain files through the use of
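The idea of dynamically updating the ignore file could be sketched roughly as follows in Python. The helper name `merge_ignore_entries` is hypothetical, and the assumption here is only what the comment above states: plain paths in a .ignore file are respected and globs should be avoided for now, so entries containing glob characters are skipped.

```python
GLOB_CHARS = set("*?[]")


def merge_ignore_entries(existing: str, paths: list[str]) -> str:
    """Return new .ignore content with plain paths appended once each.

    existing is the current file content ("" if the file does not exist).
    Entries containing glob characters are skipped on purpose.
    """
    seen = {ln.strip() for ln in existing.splitlines() if ln.strip()}
    out = existing
    if out and not out.endswith("\n"):
        out += "\n"
    for path in paths:
        path = path.strip()
        if not path or path in seen:
            continue
        if any(ch in GLOB_CHARS for ch in path):
            continue  # globs are not reliable yet, so leave them out
        out += path + "\n"
        seen.add(path)
    return out


if __name__ == "__main__":
    current = "vendor\n"
    flagged = ["vendor", "package-lock.json", "gen/*.xml"]
    print(merge_ignore_entries(current, flagged), end="")
```

Keeping the function pure (text in, text out) leaves the decision of when to actually write the file to the caller, for example after a detector has flagged a set of paths.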
Describe the bug
With the advent of ChatGPT, is it possible that the estimated cost to rebuild things will change?
Should we then change the estimates generated by the tool to take into consideration things
like Copilot, ChatGPT, etc.?
To Reproduce
Ran scc on source code that took very little effort to write, and it claims it took 3 months (not the actual case).
Expected behavior
A more realistic estimate.