
Commit 052d661

Dialektik prompting
1 parent 5b6ab2e commit 052d661

14 files changed: +156 -109 lines changed

README.md

Lines changed: 5 additions & 60 deletions
@@ -19,7 +19,6 @@ Phi-3-MLX is a versatile AI framework that leverages both the Phi-3-Vision multi
 Phi-3-MLX is designed to run on Apple Silicon Macs. The minimum requirements are:
 
 - Apple Silicon Mac (M1, M2, or later)
-- macOS 11.0 or later
 - 8GB RAM (with quantization using `quantize_model=True` option)
 
 For optimal performance, especially when working with larger models or datasets, we recommend using a Mac with 16GB RAM or more.
@@ -85,7 +84,7 @@ prompts = [
 ]
 
 # Define constraints for the generated text
-constraints = [(0, ' The'), (100, ' The correct answer is'), (1, 'X.')]
+constraints = [(0, '\nThe'), (100, ' The correct answer is'), (1, 'X.')]
 
 # Apply constrained beam decoding
 results = constrain(prompts, constraints, blind_model=True, quantize_model=True, use_beam=True)
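A note on how we read the constraint tuples above (an interpretation inferred from the surrounding code, not a documented spec): each pair seems to combine a token budget with text the beam search is steered to emit, in order.

```python
# (token_budget, steered_text) — our reading of the tuples above, not an official spec
constraints = [
    (0, '\nThe'),                     # output begins immediately with "\nThe"
    (100, ' The correct answer is'),  # reach this phrase within roughly 100 generated tokens
    (1, 'X.'),                        # then close with the answer pattern "X."
]
```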
@@ -127,49 +126,13 @@ generate("Describe the potential applications of CRISPR gene editing in medicine
     quantize_model=True,
     use_adapter=True)
 
-# Compare LoRA adapters
-test_lora(adapter_path=None) # Without LoRA adapter
-test_lora(adapter_path=True) # With default LoRA adapter
-test_lora(adapter_path="/path/to/your/lora") # With specific adapter
+# Test the performance of the trained LoRA adapter
+test_lora()
 ```
 
 ![Alt text](https://raw.githubusercontent.com/JosefAlbers/Phi-3-Vision-MLX/main/assets/train_log.png)

-## 2. HTTP Model Server
-
-1. Start the server:
-
-```
-python server.py
-```
-
-2. Send POST requests to `http://localhost:8000/v1/completions` with a JSON body:
-
-```bash
-curl -X POST http://localhost:8000/v1/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "prompt": [
-      "Hello, world!",
-      "Guten Tag!"
-    ],
-    "max_tokens": 50
-  }'
-```
-
-3. Receive JSON responses with generated text for each prompt:
-
-```json
-{
-  "model": "phi-3-vision",
-  "responses": [
-    "Hello! How can I help you today?<|end|>",
-    "Guten Tag! Wie kann ich Ihnen helfen?<|end|>"
-  ]
-}
-```
-
-## 3. Agent Interactions
+## 2. Agent Interactions
 
 ### Multi-turn Conversation

@@ -218,7 +181,7 @@ agent.end()
 
 ![Alt text](https://raw.githubusercontent.com/JosefAlbers/Phi-3-Vision-MLX/main/assets/api_agent.png)
 
-## 4. Custom Toolchains
+## 3. Custom Toolchains
 
 ### In-Context Learning Agent

@@ -310,24 +273,6 @@ benchmark()
 
 *(On M1 Max 64GB)*
 
-## More Examples
-
-For advanced examples and external library integration, see `examples.py` in the project root. Preview:
-
-```python
-# Multimodal Reddit Thread Summarizer
-from rd2md import rd2md
-from pathlib import Path
-import json
-
-filename, contents, images = rd2md()
-prompt = 'Write an executive summary of above (max 200 words). The article should capture the diverse range of opinions and key points discussed in the thread, presenting a balanced view of the topic without quoting specific users or comments directly. Focus on organizing the information cohesively, highlighting major arguments, counterarguments, and any emerging consensus or unresolved issues within the community.'
-prompts = [f'{s}\n\n{prompt}' for s in contents]
-results = [generate(prompts[i], images[i], max_tokens=512, blind_model=False, quantize_model=True, quantize_cache=False, verbose=False) for i in range(len(prompts))]
-with open(Path(filename).with_suffix('.json'), 'w') as f:
-    json.dump({'prompts':prompts, 'images':images, 'results':results}, f, indent=4)
-```
-
 ## Documentation
 
API references and additional information are available at:

api.py

Lines changed: 5 additions & 35 deletions
@@ -3,65 +3,35 @@
 
 from huggingface_hub import InferenceClient
 
-# def mistral_api(prompt, history):
-#     """
-#     Example:
-#     --------
-#     agent = Agent(toolchain = "responses, history = mistral_api(prompt, history)")
-#     agent('Write a neurology ICU admission note')
-#     """
-#     history = '<s>' if history is None else history
-#     history += f"[INST] {prompt} [/INST]"
-#     client = InferenceClient("mistralai/Mistral-7B-Instruct-v0.3", token = os.environ.get('HF_READ_TOKEN', False))
-#     generate_kwargs = dict(
-#         temperature=0.9,
-#         max_new_tokens=1024,
-#         top_p=0.95,
-#         repetition_penalty=1.0,
-#         do_sample=True,
-#         seed=42,
-#         stream=False,
-#         details=False,
-#         # details=True,
-#         return_full_text=False,
-#     )
-#     result = client.text_generation(history, **generate_kwargs)
-#     result = result.strip()
-#     # result = result.generated_text.strip() # if details=True
-#     history += f" {result}</s> "
-#     print(f'### Prompt ###\n{prompt}\n### Output ###\n{result}')
-#     return {'responses':result, 'history':history}
-
-def mistral_api(prompt, history, verbose=True, api_model="mistralai/Mistral-Nemo-Instruct-2407"):
+def mistral_api(prompt, history, verbose=True, return_dict=True, api_model="mistralai/Mistral-Nemo-Instruct-2407"):
     """
     Example:
     --------
     agent = Agent(toolchain = "responses, history = mistral_api(prompt, history)")
     agent('Write a neurology ICU admission note')
     """
-    # "mistralai/Mistral-Nemo-Instruct-2407" "mistralai/Mistral-7B-Instruct-v0.3"
     history = '<s>' if history is None else history
     history += f"[INST] {prompt} [/INST]"
     client = InferenceClient(api_model, token = os.environ.get('HF_READ_TOKEN', False))
     generate_kwargs = dict(
         temperature=0.9,
-        max_new_tokens=1024,
+        max_new_tokens=8192,
         top_p=0.95,
         repetition_penalty=1.0,
         do_sample=True,
         seed=42,
         stream=False,
         details=False,
-        # details=True,
         return_full_text=False,
     )
     result = client.text_generation(history, **generate_kwargs)
     result = result.strip()
-    # result = result.generated_text.strip() # if details=True
     history += f" {result}</s> "
     if verbose:
         print(f'### Prompt ###\n{prompt}\n### Output ###\n{result}')
-    return {'responses':result, 'history':history}
+    if return_dict:
+        return {'responses':result, 'history':history}
+    return result
 
 def bark_api(prompt):
     """

assets/ACB.pdf

-1 Bytes
Binary file not shown.

assets/agent_toolchain.pdf

-3 Bytes
Binary file not shown.

assets/dialektik.pdf

4.16 KB
Binary file not shown.

assets/dialektik.py

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
+from pathlib import Path
+from datasets import load_dataset, concatenate_datasets
+import random
+import json
+import os
+from datetime import datetime
+from huggingface_hub import InferenceClient
+import phi_3_vision_mlx as pv
+import mlx.core as mx
+from functools import partial
+import fire
+
+PATH_DS = 'JosefAlbers/StampyAI-alignment-research-dataset'
+PROMPT_THESIS = "Based on the above bullet points, create a detailed and engaging article that explores the main themes and insights. For each bullet point, provide context, elaborate on the key ideas, and discuss their implications. Ensure the article flows logically, connects related concepts, and presents a coherent narrative."
+PROMPT_ANTITHESIS = "Read through the article and write a response that challenges its main ideas. Offer different viewpoints, suggest alternative explanations, and propose new approaches. Keep your response well-structured and relevant to the original content."
+PROMPT_SYNTHESIS = """You have an initial article and a response to it:
+
+**Article:**
+{thesis}
+
+**Response:**
+{antithesis}
+
+Create an improved version of the article that incorporates insights from both the original and the response. Address conflicting ideas and present a more comprehensive view. Add new insights based on this broader perspective. Your final article should be clear, balanced, and offer a deeper understanding of the topic."""
+
+def setup(instruction="\n<|end|>\n<|user|>\nTLDR: Summarize the following text into concise, stand-alone bullet points (max 3-5 bullet points). Each bullet point should be self-contained and provide a clear and complete idea without referencing other bullet points or the original text.", list_source=['agentmodels', 'distill', 'arbital', 'blogs', 'lesswrong', 'youtube', 'arxiv', 'special_docs'], quantize_model=False, batch_size=4, path_ds=PATH_DS):
+    model, processor = pv.load(blind_model=True, quantize_model=quantize_model, quantize_cache=False, use_adapter=False)
+    def aggregate(example):
+        str_md = f"# {example['title']}\n\n{example['text']}"
+        example['str_md'] = str_md
+        example['len_md'] = processor(str_md)['input_ids'].size
+        return example
+    def summarize(example):
+        markdowns = example['str_md']
+        prompts = [f'{m}{instruction}' for m in markdowns]
+        summaries = pv.generate(prompts, preload=(model, processor), stream=False, verbose=False, max_tokens=512)
+        example['sum_md'] = summaries
+        return example
+    list_ds = []
+    try:
+        _ds_prev = load_dataset(path_ds, token=os.getenv("HF_WRITE_TOKEN"), split='train')
+        list_source = [i for i in list_source if i not in _ds_prev['source']]
+        list_ds.append(_ds_prev)
+    except:
+        print('Dataset not found.')
+    for src in list_source:
+        ds = load_dataset('StampyAI/alignment-research-dataset', src, trust_remote_code=True, split='train')
+        ds = ds.select_columns(['id', 'source', 'title', 'text', 'url', 'date_published', 'authors', 'summary', 'source_type'])
+        ds = ds.map(aggregate)
+        ds = ds.filter(lambda example: 600 < example["len_md"] < 6000)
+        if batch_size > 1:
+            ds = ds.sort('len_md')
+        ds = ds.map(summarize, batched=True, batch_size=batch_size)
+        ds = ds.filter(lambda example: ('<unk>' not in example['sum_md']) and ('<|end|>' in example['sum_md']))
+        list_ds.append(ds)
+    ds = concatenate_datasets(list_ds)
+    ds.push_to_hub(path_ds, token=os.getenv("HF_WRITE_TOKEN"), private=True)
+
+def load_books(list_source=None, list_exclude=None, path_ds=PATH_DS):
+    ds = load_dataset(path_ds, token=os.getenv("HF_READ_TOKEN", None), split='train')
+    if list_source:
+        list_source = [list_source] if isinstance(list_source, str) else list_source
+        ds = ds.filter(lambda example: example['source'] in list_source)
+    if list_exclude:
+        list_exclude = [list_exclude] if isinstance(list_exclude, str) else list_exclude
+        ds = ds.filter(lambda example: not any(word in example['sum_md'] for word in list_exclude))
+    print(f"Loaded {len(ds)} from {', '.join(set(ds['source']))}")
+    books = ds['sum_md']
+    books = [i.split('\n- ') for i in books]
+    clean_str = lambda s: s[2:] if s.startswith('- ') else s[:-7] if s.endswith('<|end|>') else s
+    books = [[clean_str(s).strip() for s in book] for book in books]
+    return books
+
+def pick_books(topic, list_idx, list_books, num_book=3):
+    if topic is None:
+        return random.sample(range(len(list_books)), num_book)
+    list_rand = list_idx if list_idx else random.sample(range(len(list_books)), 100)
+    list_text = [list_books[i][0] for i in list_rand]
+    embed = pv.GteModel()
+    l = embed(list_text)
+    q = embed(topic)
+    scores = mx.matmul(q, l.T)
+    list_idx = mx.argsort(scores)[:,:-1-num_book:-1].tolist()
+    list_idx = list_idx[0]
+    return [list_rand[i] for i in list_idx]
+
+def get_bullets(topic='AI agents', list_source=None, list_exclude=['MIRI', 'Machine Intelligence Research Institute'], list_idx=None, num_book=3, per_book=3):
+    books = load_books(list_source, list_exclude)
+    list_idx = pick_books(topic, list_idx, books, num_book)
+    print(f"Picked {list_idx}")
+    picks = [books[i] for i in list_idx]
+    bullets = ''
+    for pick in picks:
+        pick=pick[:per_book]
+        bullets += '- ' + '\n - '.join(pick) + '\n'
+    bullets = bullets.strip()
+    print(f'Bullets:\n{bullets}')
+    return bullets, list_idx
+
+def save_output(output, file_suffix=None, base_folder='syntheses'):
+    file_suffix = f'_{file_suffix}' if file_suffix else ''
+    os.makedirs(base_folder, exist_ok=True)
+    date_str = datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
+    filename = os.path.join(base_folder, f'{date_str}{file_suffix}.md')
+    with open(filename, 'w') as f:
+        f.write(output)
+
+def synthesize(topic=None, prompt_thesis=PROMPT_THESIS, prompt_antithesis=PROMPT_ANTITHESIS, prompt_synthesis=PROMPT_SYNTHESIS,
+               list_source=None, list_exclude=['MIRI', 'Machine Intelligence Research Institute'],
+               list_idx=None, num_book=3, per_book=3, llm_model=None):
+    if llm_model is None:
+        preload = pv.load(blind_model=True, quantize_model=True)
+        generate = partial(pv.generate, preload=preload)
+    else:
+        generate = partial(pv.mistral_api, api_model=llm_model, history=None, return_dict=False, verbose=False)
+    bullets, list_idx = get_bullets(topic, list_source, list_exclude, list_idx, num_book, per_book)
+    prompt = f"{bullets}\n\n{prompt_thesis}"
+    thesis_output = generate(prompt)
+    prompt_anti = f'{thesis_output}\n\n{prompt_antithesis}'
+    antithesis_output = generate(prompt_anti)
+    prompt_synth = prompt_synthesis.format(thesis=thesis_output, antithesis=antithesis_output)
+    synthesis_output = generate(prompt_synth)
+    all_output = f'Thesis:\n---\n\n{thesis_output}\n\nAntithesis:\n---\n\n{antithesis_output}\n\nSynthesis:\n---\n\n{synthesis_output}\n\nArguments:\n---\n\ndialektik.synthesize({list_source=}, {list_exclude=},{list_idx=}, {per_book=}, {llm_model=})\n\n{bullets}'
+    save_output(all_output)
+    return thesis_output, antithesis_output, synthesis_output
+
+if __name__ == "__main__":
+    fire.Fire(synthesize)
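A usage sketch for the new Dialektik pipeline (hedged: the import path and invocation below are inferred from the `synthesize` signature and the `fire.Fire(synthesize)` entry point, and assume `assets/dialektik.py` is importable or run directly):

```python
from dialektik import synthesize  # assuming assets/dialektik.py is on the import path

# Thesis → antithesis → synthesis with the local, quantized Phi-3 model;
# the combined output is also written to syntheses/<timestamp>.md by save_output()
thesis, antithesis, synthesis = synthesize(topic='AI agents', num_book=3, per_book=3)

# Or route all three generation steps through the Hugging Face Inference API
synthesize(topic='AI agents', llm_model='mistralai/Mistral-Nemo-Instruct-2407')
```

Because the module ends with `fire.Fire(synthesize)`, the same keyword arguments should also be exposed as command-line flags (e.g. `python assets/dialektik.py --topic 'AI agents'`).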

assets/mlx_porting_guide.pdf

138 Bytes
Binary file not shown.

File renamed without changes.

assets/tutorial_0.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ We'll start by comparing the original Hugging Face implementation with our MLX p
 
 ### 2. Implementing SuRoPE for 128K Context
 
-We'll explore the Surrogate Rotary Position Embedding (SuRoPE) implementation that enables Phi-3-Vision to handle impressive 128K token contexts.
+We'll explore the Su-scaled Rotary Position Embedding (SuRoPE) implementation that enables Phi-3-Vision to handle impressive 128K token contexts.
 
 ### 3. Optimizing Text Generation in MLX: From Batching to Advanced Techniques
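An aside on the SuRoPE rename in this hunk: "su-scaled" RoPE rescales the rotary frequencies with per-dimension factors and applies an extra attention scale once the context is stretched past the original training length. A rough sketch of the idea (paraphrased from the Phi-3 reference implementation, not this repo's SuRoPE class):

```python
import math
import mlx.core as mx

def su_scaled_freqs(head_dim, rope_theta, rescale_factors, max_pos, original_max_pos):
    # Per-dimension rescaling of rotary frequencies (the config's short/long factors)
    factors = mx.array(rescale_factors)
    inv_freq = 1.0 / (factors * mx.power(rope_theta, mx.arange(0, head_dim, 2) / head_dim))
    # cos/sin get an additional scale when the window exceeds the original training length
    scale = max_pos / original_max_pos
    attn_scale = 1.0 if scale <= 1.0 else math.sqrt(1 + math.log(scale) / math.log(original_max_pos))
    return inv_freq, attn_scale
```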

assets/tutorial_6.md

Lines changed: 2 additions & 2 deletions
@@ -103,7 +103,7 @@ It's especially useful in the context of AI agents and function calling. Constra
 
 In multi-agent systems, constrained decoding maintains consistent interfaces between components, allowing outputs from one model to serve reliably as inputs for another. This consistency is key for building robust, multi-step AI workflows and seamlessly integrating AI-generated code into larger systems.
 
-## 2. Guided Reasoning in Complex Decision-Making
+### 2. Guided Reasoning in Complex Decision-Making
 
 Constrained decoding can also guide the model's reasoning process in complex scenarios like medical diagnosis. Let's look at an example:
 
@@ -154,4 +154,4 @@ This method of constrained decoding is analogous to asking a student to "show th
 
 By implementing constrained decoding in complex decision-making scenarios, we can create more reliable and interpretable AI systems. This is important in high-stakes domains like medical diagnosis, legal reasoning, or financial analysis, where understanding the reasoning behind a decision is as important as the decision itself.
 
-In the next part of our series, we'll explore techniques for fine-tuning our model on custom datasets, allowing us to adapt Phi-3-Vision for specific tasks or domains.
+In the next part of our series, we'll explore techniques for fine-tuning our model on custom datasets, allowing us to adapt Phi-3-Vision for specific tasks or domains.
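To make the "show their work" point concrete, here is a hedged sketch of guided reasoning using the `constrain` call shown in the README; the prompt and constraint strings are illustrative, not the tutorial's own example:

```python
from phi_3_vision_mlx import constrain  # import path assumed from the package name

prompts = [
    'A 55-year-old man presents with sudden tearing chest pain radiating to the back. '
    'What is the most likely diagnosis?'
]
# Steer the model to narrate its findings before committing to a diagnosis
constraints = [
    (100, ' The key findings are'),
    (200, ' Therefore, the most likely diagnosis is'),
]
results = constrain(prompts, constraints, blind_model=True, quantize_model=True, use_beam=True)
```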
