Attributes/tags search #36
Good ideas, thank you! Could you contrast that with how other citation tools implement the advanced search for citation insertion/paper exploration?
The entire collection gets downloaded locally and stored in jupyterlab-citation-manager/src/zotero.ts Lines 122 to 273 in bc366a8
It speaks CSL JSON as defined in https://raw.githubusercontent.com/citation-style-language/schema/master/schemas/input/csl-citation.json and https://raw.githubusercontent.com/citation-style-language/schema/master/schemas/input/csl-data which means that parsing dates is... challenging. I think there is some normalization elsewhere in the codebase to make it more palatable. The JSON is then filtered and sorted in various selectors which implement jupyterlab-citation-manager/src/components/selector.tsx Lines 28 to 34 in bc366a8
The default model currently does simple filtering based on title, year and authors, and sorts on those three, using the number of citations in the current document to break ties: jupyterlab-citation-manager/src/components/citationSelector.tsx Lines 136 to 213 in bc366a8
This needs writing some unit tests. |
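For readers following along, the filtering and tie-breaking described above can be sketched roughly as follows. This is a minimal illustration and not the code in citationSelector.tsx: the `Item` shape, `countMatches` and `filterAndRank` are hypothetical names, and the scoring (one point per query token found in a field) is an assumption made for the example.

```typescript
// Hypothetical, simplified item shape; the real extension works on CSL-JSON records.
interface Item {
  id: string;
  title: string;
  year?: number;
  authors: string[];
}

// Count how many query tokens occur in the given text (case-insensitive).
function countMatches(text: string, tokens: string[]): number {
  const haystack = text.toLowerCase();
  return tokens.filter(token => haystack.indexOf(token) !== -1).length;
}

// Keep items matching on title, year or authors; rank by match count and
// break ties with the number of citations in the current document.
function filterAndRank(
  items: Item[],
  query: string,
  citationCounts: { [id: string]: number }
): Item[] {
  const tokens = query
    .toLowerCase()
    .split(/\s+/)
    .filter(token => token.length > 0);
  if (tokens.length === 0) {
    return [];
  }
  const scored = items
    .map(item => {
      const yearText = item.year === undefined ? '' : String(item.year);
      const score =
        countMatches(item.title, tokens) +
        countMatches(yearText, tokens) +
        countMatches(item.authors.join(' '), tokens);
      return { item, score, citations: citationCounts[item.id] || 0 };
    })
    .filter(entry => entry.score > 0);
  // Higher score first; on equal score, the more frequently cited item first.
  scored.sort((a, b) => b.score - a.score || b.citations - a.citations);
  return scored.map(entry => entry.item);
}
```

Keeping the ranking in a pure function like this would also make it straightforward to cover with unit tests.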
Hi Mike, sorry for the late reply - I will investigate the other reference managers after the 15th (cob). Thank you for the explanation - this is really fascinating! Is the Also, could the references inserted into a notebook be dumped to a |
I'm pretty sure I can, what do you want to know? |
Hi @retorquere, thank you for the prompt answer! Full disclosure: I am a newbie in reference managers - I use this Here's the thing: As a side note, I was also curious to know how you store data: as @krassowski explained above, Thank you! EDIT: there's also this interesting thread on gitter and should/might be related with #15 and #8 I guess |
It's probably using BBT's JSON-RPC search endpoint, and that passes the work to Zotero quicksearch, which should search on all fields & tags. I'm not sure what differentiates search on "all fields and tags" and "everything" on Zotero, but I'd guess that "everything" includes attachment content.
I don't do CSL-JSON date parsing, but I do produce CSL-JSON dates, and they appear to me to be very well-defined structured objects - there really isn't anything to parse in CSL dates AFAICT. Do you have a sample of a hard-to-parse CSL date, @krassowski? Parsing free-form dates into CSL is another matter. The BBT date parser is a few hundred lines of code on top of two pretty large EDTF-parsing libraries.
I have an SQLite db in the Zotero data directory for most BBT data, and a bunch of JSON files for the caches. These can only be read when Zotero is not running: Zotero locks SQLite databases while it is running, and BBT reads and then deletes the caches, so that an error that prevents saving the cache cannot lead to a stale cache being read on the next startup. It's better to start with an empty cache (which is a self-repairing situation) than with a stale one. The caches are written back out when Zotero shuts down.
Several ways in fact:
I don't know what the topic under discussion is there. |
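To make the JSON-RPC search mentioned in the comment above more concrete, here is a small sketch that only builds the request body and sends nothing. The endpoint URL and the `item.search` method name reflect my understanding of BBT's JSON-RPC interface and should be treated as assumptions to verify against the BBT documentation; `buildSearchRequest` is a hypothetical helper.

```typescript
// JSON-RPC 2.0 request envelope.
interface JsonRpcRequest {
  jsonrpc: '2.0';
  method: string;
  params: unknown[];
  id: number;
}

// ASSUMPTION: URL and method name as I understand BBT's JSON-RPC interface;
// check both against the BBT documentation before relying on them.
const ASSUMED_ENDPOINT = 'http://localhost:23119/better-bibtex/json-rpc';
const ASSUMED_SEARCH_METHOD = 'item.search';

// Build (but do not send) a search request for the given quick-search terms.
function buildSearchRequest(terms: string, id: number = 1): JsonRpcRequest {
  return {
    jsonrpc: '2.0',
    method: ASSUMED_SEARCH_METHOD,
    params: [terms],
    id
  };
}

// The body that would be POSTed to ASSUMED_ENDPOINT as application/json.
const exampleBody = JSON.stringify(buildSearchRequest('knuth literate'));
```

Sending this requires a running Zotero client with BBT installed, which is the dependency discussed further down the thread.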
Thank you for your prompt and exhaustive reply!
That I remember, I had a look inside my
From Zotero, right?
Do these pull-export the whole collection or just the files inside the article/publication?
Oops, sorry: this is more related to exporting to MyST markdown formats. I am drifting off-topic; I guess I should move this discussion somewhere else. In the meantime, thank you for the answers; let's wait and see if Mike has something to comment upon. |
EDTF strings are valid values for date variables in the CSL-JSON schema, as is the "structured" form, which may take anywhere between one and three parts; these may be strings or numbers, and their meaning is not very well documented. Then you have the extra fields like circa, season, etc., and multiple date fields (some records have a publication date, a creation date, etc.). One of those is mandated by CSL, if I recall correctly, but it often contains only the year, with all the details in the other fields, and how those are populated appears random to me after looking at a large sample of records from Zotero. |
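To illustrate why the variety of date shapes described above is awkward for filtering by year, here is a best-effort sketch that pulls a year out of either a plain string or a structured date. `CslDate` is a subset of the CSL-JSON date variable as I understand the schema, and `extractYear` and `yearFromText` are hypothetical helpers, not the extension's actual normalization code. Proper EDTF handling needs a dedicated library; the regular expression here only finds the first four-digit run.

```typescript
// Subset of the CSL-JSON date variable shape (fields as I understand the schema).
interface CslDate {
  'date-parts'?: Array<Array<string | number>>;
  season?: string | number;
  circa?: string | number | boolean;
  literal?: string;
  raw?: string;
}

// Find the first standalone four-digit run in free text, if any.
function yearFromText(text: string): number | undefined {
  const match = /(^|\D)(\d{4})(?!\d)/.exec(text);
  return match ? Number(match[2]) : undefined;
}

// Best-effort year extraction from a string or a structured date.
function extractYear(date: CslDate | string | undefined): number | undefined {
  if (date === undefined) {
    return undefined;
  }
  if (typeof date === 'string') {
    return yearFromText(date);
  }
  const parts = date['date-parts'];
  if (parts && parts.length > 0 && parts[0].length > 0) {
    const first = parts[0][0];
    const year = Number(first);
    // Parts may be strings or numbers; reject empty or non-numeric values.
    if (String(first).trim() !== '' && isFinite(year)) {
      return year;
    }
  }
  // Fall back to the free-text fields when the structured part is unusable.
  return yearFromText(date.raw || date.literal || '');
}
```

For example, `extractYear({ 'date-parts': [['1999']] })` and `extractYear({ raw: 'circa May 1999' })` both yield 1999 under these assumptions.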
This is tracked in the other issues you mentioned - let's keep this issue focused on the search capabilities ;) |
No, it is more of an in-browser cache for JupyterLab; it is not codified into Jupyter at all. And it might change at any point. |
I can't figure out the UI for this: do you just write something like
But you basically query from it, right? So entries have fields and the matter here is just to find a UI query-style consistent with what other citation managers offer, am I correct? |
Thank you @retorquere for your generous advice and for explaining how BBT works. From that, I gather it relies on a local Zotero installation and access to its local API to pass on the search tasks; @baggiponte, as also discussed in #37, this is really not a feasible path for this extension, for several reasons:
|
Yes, the entries follow |
Oh yeah that's complex so I don't bother doing it myself, I outsource that to a library.
I find these not too hard to process, but TBH I don't support all possible combinations. What I can sensibly output is constrained by the target format (bibtex and biblatex) and since biblatex supports EDTF, I just forward whatever is deemed (by said library) to be valid EDTF.
I don't use the Zotero date parser; BBT's date parser differs significantly from Zotero's.
Correct. BBT is only available in the Zotero client.
By files, do you mean attachments? There's not yet a JSON-RPC endpoint for that. You can pull down bibtex or biblatex from the endpoint.
I don't translate it at all; I just pass the text on to the same code that handles the quick search above the item list in Zotero, and return the results.
correct
There are ways around that, but they're not convenient. I have a branch where I work on a BBT that doesn't need the client, but I have no ETA on that beyond "not soon".
correct.
Only pain lies that way. You most certainly never want to write to the DB directly.
Not a great fan of Flatpak et al., but I see the appeal |
The extension allows (fuzzy) searching only by the title of the pages; it would be interesting to see tag/field/attribute search, like in Gmail or GitHub issues:
date:YYYY-MM-DD to filter papers of a certain period
author:Surname,Name
And so on. Also, it would be interesting to have a way to disable fuzzy searching (say, by typing \ at the start of a line).
I don't know how difficult this would be to implement, as I have not properly understood how the extension works: whether it is similar to BetterBibTex (which I don't really know a lot about) and/or whether it dumps the citations to a JSON/SQLite store which is then queried. I guess this just sends a request to Zotero via the API?
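As a rough illustration of the query syntax proposed in this issue, the sketch below splits a query into `field:value` filters, free text, and a flag that a leading backslash turns off. It is a hypothetical parser written for this example, not something the extension implements; `ParsedQuery` and `parseQuery` are made-up names, and the set of supported fields is left open.

```typescript
// Result of parsing a search box query.
interface ParsedQuery {
  fuzzy: boolean; // false when the query starts with a backslash
  fields: { [name: string]: string }; // e.g. { author: 'Knuth,Donald' }
  text: string; // remaining free-text terms
}

// ASSUMPTION: syntax follows the proposal in this issue; nothing here is
// taken from the extension's code.
function parseQuery(input: string): ParsedQuery {
  let rest = input.trim();
  let fuzzy = true;
  if (rest.charAt(0) === '\\') {
    // A leading backslash requests exact (non-fuzzy) matching.
    fuzzy = false;
    rest = rest.slice(1).trim();
  }
  const fields: { [name: string]: string } = {};
  const freeText: string[] = [];
  for (const token of rest.split(/\s+/)) {
    if (token === '') {
      continue;
    }
    // A token such as "date:2020-01-31" becomes a field filter.
    const match = /^([A-Za-z]+):(.+)$/.exec(token);
    if (match) {
      fields[match[1].toLowerCase()] = match[2];
    } else {
      freeText.push(token);
    }
  }
  return { fuzzy, fields, text: freeText.join(' ') };
}
```

Because tokens are split on whitespace, a value such as `author:Knuth, Donald` (with a space after the comma) would need quoting support, which this sketch leaves out.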