Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concerned about moving parts of modules into Lua #3128

Open
pjsg opened this issue May 26, 2020 · 21 comments
Open

Concerned about moving parts of modules into Lua #3128

pjsg opened this issue May 26, 2020 · 21 comments

Comments

@pjsg
Copy link
Member

pjsg commented May 26, 2020

One of the nice things about nodemcu-firmware is that it allows you to build small apps very easily with just esplorer/similar. You build an image with the cloud builder, flash it, and then you can build something.

However, as modules start to be moved into Lua, this means that you can't do this stuff any more as you need to use the LFS to fit all the code into memory. This raises the complexity of the toy development environment significantly.

What could be solutions to this problem? Maybe the cloud builder could build an LFS partition with all (or selected) Lua libraries that are part of nodemcu-firmware, and include this partition in the firmware file that the user then flashes onto their boards.

Comments?

@nwf
Copy link
Member

nwf commented May 26, 2020

I raised a similar point in the wrong repo at marcelstoer/docker-nodemcu-build#91. Since this is probably a better place for the conversation, I'll close that one and repeat what I said there here:

-----8<-----

It would be useful, I think, if we could pick at least in-tree lua_modules directly into a LFS image as part of the firmware bundle. This would lower the friction of jettisoning some "somewhat hairy" C modules in favor of Lua+C hybrids as I've been trying to drive us to (e.g. #2819 and now #3126).

However, I understand if there's hesitation to do LFS builds because people might want to add their own modules, and that opens "now I need to let people upload files" which is always a little hair-raising on the web. So...

Maybe instead of LFS, landing selected lua_modules in SPIFFS would still be useful. They could still be require-d as is, at the cost of increased memory consumption, and given @TerryE's proposal in #2917, having Lua files in SPIFFS would eventually let us tell people how to compile their LFS images on-module.

-----8<-----

I know I'm being a little pushy on this, but to repeat myself: we have not demonstrated the ability to write and maintain complicated modules in C. In many cases our modules merely appear to work or work well enough for toys but not for anything robust; memory management errors abound. Fixes for these issues have been very slow in coming (which, I acknowledge, is partially my fault as I would rather not fix them in C).

I think finding a way for the builder to add in-tree Lua modules to SPIFFS at least gets us to the point of "it's easy to run" again, at the possible cost of chewing up more RAM while so doing. (Though it would be very useful to measure roughly how much more RAM we're talking. The proposal in #2819 surely can't use that much, he says, naively... and the outline in #3126 suggests leaving the heaviest parts of the data structures in C.)

The lack of a satisfactory answer for "easy way to get an image including SPIFFS/LFS" is partially why I haven't been more adamant that #2819 get merged despite its now near feature- and usability-parity with the C sntp module.

@HHHartmann
Copy link
Member

HHHartmann commented May 26, 2020

I also think that using a C/Lua module must not be any more difficult than using a plain C module.
Select it in the configuration file and have it available in the FW.
I have been locking in how LFS images are flashed to memory.
(I hope I get this half way correct)

  1. leave out the decompression here
  2. check version, palusability and size of given file
  3. replace relative adresse with absolute adresses
  4. write LFS to partition
  5. node.flashindex(funcName) then runs the main function to select the required module.

It should be possible to do something similar in the build process.

  1. use luac.cross to generate an LFS image for each required Lua partial module.
  2. Decompress and include it in the firmware (possibly as a byte array).
  3. use the map file to find it in the bin file.
  4. change the relative addresses to absolute addresses.
    • by implementing another copy of the algo already implemented in the FW load and in luac.cross.
    • by building the LFS image again but as absolute address version, using the given address. (-a) and the replacing it in the image.

Maybe step IV above could be simplified by directly retuning just the one module. But that would require to change luac.cross and can wait till later.
Alternatively all Lua parts of hybrid modules could be stored in one LFS image, but the logistics of that would be more difficult to handle.

A separate Implementation of this for Lua 5.1, 5.3 and likely the ESP32 tree would be needed.

The modules might then be "auto required" as the C modules are.

@TerryE am I completely wrong or is this halfway thinkable.

@HHHartmann
Copy link
Member

Argh. Forgot about ROTables and ROStrings.

@nwf
Copy link
Member

nwf commented May 27, 2020

(As an aside that should not detract from the main issue here: AIUI, Lua 5.2 shifted away from "modules register themselves in the global namespace" and I'd kind of like us to do the same. It's already true that we tend to write things like local tmr = tmr for performance, and local tmr = require "tmr" seems not any worse. We can hook _G.__index to provide deprecation warnings as a transition strategy.)

@TerryE
Copy link
Collaborator

TerryE commented May 27, 2020

I write Lua code on a PC differently that on the ESP. I don't really need to worry about memory use in coding a PC as any app that I write will have 10-100 × less memory needs than I typical app like Word or Firefox.

If I am using LFS then the only difference from writing on a PC is that I now need to be careful about data use as I now only have 45Kb (or less if C modules need a bit), but other than that code and
constant space isn't too much of an issue (at least for the aps that I write). I can lay out the code for clarity and simplicity. All compilation is done on the PC

If I run Lua from RAM or want to do source loading, then immediately things start to get really complicated as this 40-45Kb now has to be enough for data, code, and any compile overheads. I find this a pain for the simple apps that I cut, and so IMO it is really easy for a new IoT developers to hit these limits and the coding techniques that you need to deploy to mitigate them are really quite advanced. OK, having a decent well written tutorial and FAQs on how to use these techniques would really help, but even then the new developer needs to take the time to read them.

So my tl;dr so far is that I broadly agree with @pjsg Philip that in general if we only provide C/Lua hybrid modules then LFS quickly becomes essential. Scaling RAM-base Lua apps is difficult.

However, I also agree with @nwf Nathaniel that Lua is simpler language and its memory management feature make it a better vehicle for bulk applications code. The leaner and meaner better for C modules, IMO. But this again tends to make LFS unavoidable.

IMO, the major issue with LFS and beginners is that we don't make available a prebuilt Windows command line version of luac,cross as standard. We have a Travis host builds target which actually validates that we can build a Windows luac.cross image. Why don't we just keep one and publish it every time we update dev or master? I really can't understand why one of you, @HHHartmann or @pjsg say, hasn't just filled in this last piece. This would really to script IMO for any committer here who is largely Windows based. (I am Linux only, but I will wiling support anyone doing this.)

The techniques that I discuss in #3117 have a very limited scope: to allow Lua developers to spin up an FTP or telnet server (or possibly other previsioning service on what is a bare module with just a cloud builder image loaded. Getting some of these running within RAM limits involves some advanced Lua but since this code is canned, the user doesn't need to know all this.

@HHHartmann Gregor, I am not sure what your summary if for. It is incorrect in some details. I think that this underlines we could do with a simple explanation of the way of storing Lua code in terms that someone new to IoT could understand, but I feel that this dialogue lies outside this particular issue.

@nwf Nathaniel the reason for doing techniques like local tmr=tmr came of something that Philip raised in #1590, and that is that the eLua-based ROTable access could easily be 50-100 × slower than the equivalent table access. I've been chipping away and introduced a ROTable access cache
so that the overall ROTable overhead is typically less than 2× so this sort of local caching is really only worthwhile if your app is going to use this local copy 100s of times.

In general there are all sorts of bear traps in using environments so doing this is really for advanced programmers. I use luac -l analysis with some filter scripts + luacheck on my source to lake sure that I am using locals and upvals, and only create global variables on an exceptional basis.

@marcelstoer
Copy link
Member

Thanks Philip for raising this issue. I guess you won't be surprised about where I stand on this. I have been advocating for and working towards low entry barriers for the average user all my NodeMCU life. Hence, I share your concerns.

So, my fundamental goal is this: never make things more complicated - for the user/developer - than they were.

This proposal seems sane from a technical perspective. However, I feel we should first take measures to ensure the build & flash process is as (or more) painless than it is now before moving away from the current paradigm.

@HHHartmann
Copy link
Member

I got a working proof of concept of an LFS which is embedded in the firmware image and can be accesses just like the regular LFS (using node.flashindex2).
At the moment there are two scripts, one to be run before make and the other after that.
But I am sure they can be added to the build process.

So there would be an easy way to add ftp and telnet, but also the Lua part of Lua+C modules.

My Branch would be https://github.com/HHHartmann/nodemcu-firmware/tree/embedded-lfs
Currently the scripts generate several diagnostics files. That part will be removed later on.

@nwf
Copy link
Member

nwf commented Jun 3, 2020

That's an interesting idea, and is probably simpler than merging LFS images... but.

I don't think I like flashindex2. Add a second parameter to flashindex, so that it's node.flashindex([module], [lfsindex]) where module is a string (as now) and lfsindex an integer; if flashindex is just given an integer, it returns metadata about that LFS. Similarly for flashreload(imagename, [lfsindex]) to support replacing either one? Though that latter proposal might get more complicated by the next question...

More technically, how do you resolve duplicate entries in the LFS string tables? That was the original reason I suggested merging LFS images "on the fly" on the device.

@TerryE
Copy link
Collaborator

TerryE commented Jun 4, 2020

I got a working proof of concept of an LFS which is embedded in the firmware image and can be accesses just like the regular LFS (using node.flashindex2).

@HHHartmann Gregor, on first reading I misunderstood what you are proposing here. That is because when we've previously talked about 2 LFS partitions, this was in the context of on-ESP LFS building where the loader would need an "existing" and "new" LFS partition during the load process.

(Incidentally IMO, we need to start to deprecate the Lua 5.1 environment by doing a feature freeze, including this type of development. Any new features such as this should be offered on Lua 5.3 only)

What you are now talking about is having 2 live LFS partitions in the runtime:

  • a "system" LFS for standard Lua modules that is built during the make process and comes as part of the firmware image, and
  • a separate application LFS that can be reflashed at runtime.

As you show the runtime overhead is quite small (instead of resolving (new) strings against the RWstrt you now resolve them against the RWstrt then the application ROstrt then the system ROstrt. However I think that we should be discussing the features and functionality here rather than implementation details.

  • Do we want to do this? What are the benefits? What are the disadvantages?
  • Given that this is the standard build technique that DiUS use (albeit with one LFS) that is the LFS is built as part of the make (with the -f -a option) and assembled into the image, we could in principle do this.
  • However with 2 LFS regions we do have a few things to think about.
    • Strings are interned so application LFS images should not contain strings that are in the system LFS (otherwise strings equality and keys in tables could break). This means we either need to drop the interned constraint and add extra runtime equality checks on strings, or luac.cross needs to be extended to prescan a system LFS ROstrt. Lots of bear-pits here.
    • Name resolution. Like Nathaniel I don't like the idea of two index functions. One should search the app LFS first then the sys LFS (so that devs can functionally overwrite a sys LFS function.
    • Reloading the system LFS. This gets messy because of the interning issue and also because I assume that the system LFS will be right sized. Again bear-pits.

I just feel that making a native Windows luac.cross executable available is surely the simplest choice for most developers. I repeat this request and suggest that if we want to promote hybrid Lua/C modules then we need to have simple and robust ways of making them available to new developers.

@nwf
Copy link
Member

nwf commented Jun 4, 2020

This all seems very complicated and I don't think it has to be. I would like, and can try to make it so that, our build process...

  • consumes, somewhere internally (ultimately derived from WIP: Kconfig #3138, but in the interim we can do something else), a list of Lua files as well as a list of C modules
  • generates the existing .bin full of the SDK and our C goo
  • generates a LFS from those selected Lua files
  • generates a SPIFFS image with the compiled versions of those Lua files under a common pattern (lfs-%w*.lc, perhaps?)
  • stitches the C-goo .bin, LFS, and SPIFFS components together into a single flashable .bin

Armed with that, the basic flow for novice users is a delightfully simple three-stage process almost indistinguishable from the current state of affairs:

  • Select some modules (e.g., on the cloud builder, using Kconfig, or by modifying files like now)
    Internally, this generates both the list of C modules and the list of Lua files to land in the firmware.
  • make, either run by the developer or by the cloud builder, cranks out a .bin for the developer.
  • The developer lands their glorious .bin onto the device. Glorious profit.

The sole complexity here is that we have now occupied the LFS slot with nodemcu-firmware-provided goo. But that's OK, because we can make the "I want to add myfoo to the LFS" flow as simple as...

  • Either:
    • Cross-compile and transfer lfs-myfoo.lc into SPIFFS, next to the other lfs-....lc files, or
    • transfer myfoo.lua and compile on the device to generate lfs-myfoo.lc
  • Use node.loadflash("lfs-%w*.lc") as per Unified loading and saving code to RAM and LFS #2917.
  • Wait three reboots while the module sorts itself out.

There. That hardly seems world-endingly bad. There's little need for anything new other than loadflash and the "squash all the build artifacts together" make step.

But we should still offer lua.cross.exe for our Windows-using friends to help with the "I want to add" flow above, and we should still support host-side generation of entire LFS partitions like we do now, for advanced users wanting simpler ahead of time provisioning.

@HHHartmann
Copy link
Member

(Incidentally IMO, we need to start to deprecate the Lua 5.1 environment by doing a feature freeze, including this type of development. Any new features such as this should be offered on Lua 5.3 only)

Sure this will have to be ported to Lua 5.3 and maybe even ESP32. I saw the Implementation is different, but hope that the general idea will work there too.

What you are now talking about is having 2 live LFS partitions in the runtime:

  • a "system" LFS for standard Lua modules that is built during the make process and comes as part of the firmware image, and
  • a separate application LFS that can be reflashed at runtime.

Correct.

  • Do we want to do this? What are the benefits? What are the disadvantages?

I see the main benefit in that we then can add Lua modues to the firmware just as we can add c modules while the application developer can have full control over his (own) LFS partition.

  • However with 2 LFS regions we do have a few things to think about.

    • Strings are interned so application LFS images should not contain strings that are in the system LFS (otherwise strings equality and keys in tables could break). This means we either need to drop the interned constraint and add extra runtime equality checks on strings, or luac.cross needs to be extended to prescan a system LFS ROstrt. Lots of bear-pits here.

Building an LFS image against a firmware sounds like no good idea. An LFS image should work independent of a certain firmware configuration (the required modules have to be in of corse)

Maybe the application LFS could be checked for duplicate strings on reflash, just excluding the duplicate ones.

  • Name resolution. Like Nathaniel I don't like the idea of two index functions. One should search the app LFS first then the sys LFS (so that devs can functionally overwrite a sys LFS function.

I fully agree. But I think this should be done in Lua, just extending the code in _init.lua. In my _init.lua I also first search SPIFFS to be able to override LFS.
I just chose this interface because it was easiest to implement well knowing that we would discuss this further.
Adding a second parameter to node.flashindex and not to node.flashreload might lead to confusion however. Maybe something like node.FWFlashIndex or node.InternalFlashIndex. But I don't really feel strong about either.

  • Reloading the system LFS. This gets messy because of the interning issue and also because I assume that the system LFS will be right sized. Again bear-pits.

I did not plan to reload the system LFS. There is no way to reload C libraries either.

I just feel that making a native Windows luac.cross executable available is surely the simplest choice for most developers. I repeat this request and suggest that if we want to promote hybrid Lua/C modules then we need to have simple and robust ways of making them available to new developers.

I agree that it sure would help to have the binary somewhere available.
But I still think that it is too complicated to manually copy the needed Lua files from the source tree for that audience.

  • you need to repeat it for (potentially) every new FW version
  • you need to know which Lua files you need for which c core module.
  • you need to find out how to access them (clone/download repository or download individual files from website)

@nwf
Copy link
Member

nwf commented Jun 4, 2020

@HHHartmann I am curious for your opinion on my (I think much simpler) proposal, which does not require finding, selecting, or copying files as we land them in the SPIFFS at build time.

@TerryE
Copy link
Collaborator

TerryE commented Jun 4, 2020

Don't underestimate this issue of (short) strings being interned. On the one hand, I really don't want to introduce the performance hit of doing a strcmp() for string equality, but if we don't then we would need to remove duplicate strings on app LFS load. This is easy to do in the case of the Lua53 image format, but not in the case of the Lua51 format. (Though this would still break a -a load without further work.)

I've already backported a lot of Lua53 improvements into Lua51 -- basically those needed to allow the app/modules to work across both versions. I don't want to have to do a load more backports, so if we really want to go the sys+app route then I strongly advise that we do this for Lua53 only.

The alternative is to land them in SPIFFS but I suggest we can also do this at configure time. A single fast.getmodule() could download the specified module direct from GitHub. The downside of this is that they would run from RAM, not Flash, so FTP takes over 20Kb (and isn't even loadable in Lua format) and the new sntp module takes 10Kb. However running fast.split() on FTP drops this to about 4Kb (and only when the FTP service is running). (Do a git diff HEAD^1 -- lua_modules/ftp/ftpserver.lua to see how this module is overlay enabled.)

I can't think of any simple magic bullets here.

@nwf
Copy link
Member

nwf commented Jun 4, 2020

@TerryE: I think the solution sketched in #3128 (comment) is a pretty straightforward magic bullet?

@TerryE
Copy link
Collaborator

TerryE commented Jun 4, 2020

@TerryE: I think the solution sketched in #3128 (comment) is a pretty straightforward magic bullet?

String Interning -- uuuaarrghhh

The nub of this issue is that if the sys LFS contains the string "print" say, then "print" will be in the sys ROstrt. If the app LFS refers to the string "print" it has to acquire the sys ROstrt instance on flashreload(); it must not create a second copy in its app ROstrt.

The alternative of doing this deduplication during the luac.cross requires the compiler to be able to load the string metadata in the firmware image and this is a fundamental architectural change.

This approach really requires the Lua53 image dump format which was designed to support this type of merge (as it happens to facilitate on ESP LFS building, #2917, but it works here as well). If we want Lua 51 to support this as well as Lua53 then I am going to have to backport this Lua53 dump format.

@TerryE
Copy link
Collaborator

TerryE commented Jun 4, 2020

My suggestion is that we put the implementation of this on the back-burner for a month or so. I would really like to get the next master drop out of the way and my next #3078 work dumped from my working branch into a PR for merge into dev first.

@nwf
Copy link
Member

nwf commented Jun 4, 2020

@TerryE: I... believe you read the wrong comment, as the one I pointed you at no longer deals with two LFSes.

@TerryE
Copy link
Collaborator

TerryE commented Jun 4, 2020

Yup, but this really needs the on-ESP delta load. A lot of common ground 😄

@TerryE
Copy link
Collaborator

TerryE commented Jun 5, 2020

Sorry, let me expand on my last comment. @nwf, your scenario delivers a firmware with a single LFS that is pre-populated with any selected Lua modules and examples. If beginners wanted to extend this with their own application modules, then they would need #2917 if they don't have luac.cross installed locally, and this is currently planned to be supported with Lua 5.3. IMO, this makes implementing #2917 a priority.

@TerryE
Copy link
Collaborator

TerryE commented Jun 9, 2020

I've been thinking more about @HHHartmann's above suggestion of dual LFSs and now think that this is going to be quite workable. We would want to support two variant of boot modes:

  • boot using a single LFS: LFS1 or LFS2, and
  • chained mode where LFS1 (sys) the LFS2 (app) are enabled.

The former allow safe loading of a step-up LFS where the last thing the load process does is to toggle the RCR selector for LFS. The latter allows the sys+app concept discussed above.

Both modes are a destructive update of the second LFS (otherwise we would need 3 LFS partitions during load which is just getting too hairy).

Both require the LFS loader to resolve string duplication across LFSs (in mode 2 only) on the fly during load. On reflection this is straightforward -- that is once we accept that we need to do it.

The standard Lua dump(LC) format only contains a single top level function (TLF), and loading it returns a single function data type. If the LC is being loaded into LFS (or in fact we could extend this generally) then it could contain a concatenation of multiple TLFs. In this case the load in essence returns a table of {modname=func} where the modname is based on the sourcefile basename.

So long as the LC formats are valid and the total fits in the LFS region then any wildcard of LC files will load into an image. I suggest that I make this the next PR after the error handling one.(*)

(*) unless bumped by some priority review changes.

@HHHartmann
Copy link
Member

Still no progress here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants