Commit Graph

787 Commits

Author SHA1 Message Date
Richie 20a204612f added data dir for training 2026-04-12 10:08:23 -04:00
Richie 27b609052c updated spell check 2026-04-12 10:08:23 -04:00
Richie 20fb24e244 added storage pool 2026-04-12 10:08:23 -04:00
Richie 230ab1d7f6 added tiktoken 2026-04-12 10:08:23 -04:00
Richie 9ffaa1b755 added summarization_prompts.py to store the prompts 2026-04-12 10:08:23 -04:00
Richie c6b4ed4814 added tools dir for one-off scripts I used 2026-04-12 10:08:23 -04:00
Richie 88ceeb55a1 added batch_bill_summarizer.py
batch bill summarizer sends a batch API call to GPT
2026-04-12 10:08:23 -04:00
Richie 6c57d74644 decreased root_pool/models snapshot life 2026-04-12 10:08:23 -04:00
Richie cb98090f95 added bill_token_compression.py
Tested on a sample of 100 bills matching the distribution of our data.
Compression saves ~11.5% on prompt tokens; completion/reasoning are roughly equal across the two sets.
set            prompt  completion  reasoning    total
compressed    349,460     157,110    112,128  506,570
uncompressed  394,948     154,710    110,080  549,658
delta         −45,488      +2,400     +2,048  −43,088
2026-04-12 10:08:23 -04:00
Richie 63cb48a3dd created main prompt bench 2026-04-12 10:08:23 -04:00
Richie 6f6d247d3e fixed sunshine.nix 2026-04-12 10:08:23 -04:00
Richie 6b63315579 converting bob to a server 2026-04-12 10:08:23 -04:00
Richie a093c72eb9 creating prompt_bench downloader 2026-04-12 10:08:23 -04:00
Richie 67622c0e51 setting up hedgedoc 2026-04-11 11:42:08 -04:00
Richie d2f447a1af disabling kafka 2026-04-11 11:11:21 -04:00
Richie af365fce9a setup sunshine.nix 2026-04-03 17:12:24 -04:00
Richie 6430049e92 updated postgres snapshot settings 2026-03-30 14:07:08 -04:00
Richie 26e4620f8f fixed systemd sandboxing 2026-03-30 14:07:08 -04:00
Richie 93fc700fa2 removed preStart step 2026-03-30 14:07:08 -04:00
Richie 8d1c1fc628 added mountpoint= to postgres zfs create 2026-03-30 14:07:08 -04:00
Richie dda318753b improving postgres wal 2026-03-30 14:07:08 -04:00
Richie 261ff139f7 removed ds table from richie DB 2026-03-29 15:54:54 -04:00
Richie ba8ff35109 updated ingest_congress to use congress-legislators for legislator info 2026-03-29 15:54:54 -04:00
Richie e368402eea adding LegislatorSocialMedia 2026-03-29 15:54:54 -04:00
Richie dd9329d218 fixed tests 2026-03-29 15:54:54 -04:00
Richie 89f6627bed converted session.execute(select to session.scalars(select 2026-03-29 15:54:54 -04:00
Richie c5babf8bad ran treefmt 2026-03-29 15:54:54 -04:00
Richie dae38ffd9b added ingest_congress.py 2026-03-29 15:54:54 -04:00
Richie ca62cc36a7 adding congress data to new DS DB 2026-03-29 15:54:54 -04:00
Richie 035410f39e adding nemotron-3-nano 2026-03-29 15:54:54 -04:00
Richie e40ab757ca making more generic exception handling 2026-03-29 15:54:54 -04:00
Richie 345ba94a59 ran ingest_posts 2026-03-29 15:54:54 -04:00
Richie f2084206b6 adding tables for 2023 2026-03-29 15:54:54 -04:00
Richie 50e764146a added ingest_posts.py 2026-03-29 15:54:54 -04:00
Richie ea97b5eb19 adding 2026 partitions 2026-03-29 15:54:54 -04:00
Richie 1ef2512daa adding post table 2026-03-29 15:54:54 -04:00
Richie f9a9e5395c added media/temp for fast dir when working with data 2026-03-29 15:54:54 -04:00
Richie d8e166a340 adding data_science_dev 2026-03-29 15:54:54 -04:00
Richie c266ba79f4 updated snapshot_config.toml 2026-03-29 14:12:06 -04:00
Richie f627a5ac6e enabling kafka 2026-03-26 09:59:31 -04:00
Richie a5e7d97213 adding full qwen3 2026-03-24 16:20:21 -04:00
Richie 1419deb3c6 setting up brain nix serve 2026-03-24 15:04:48 -04:00
Richie 1f06692696 adding zstd to firefox settings 2026-03-24 12:53:44 -04:00
Richie 8f8177f36e adding zstd compression to fastapi 2026-03-24 12:53:44 -04:00
Richie 8534edc285 added git key binds 2026-03-24 12:45:51 -04:00
Richie 73b28a855b fixed missed renames 2026-03-24 12:45:51 -04:00
Richie 0c0810a06b added cycle status 2026-03-24 12:45:51 -04:00
Richie 239bef975a adding availability status to HA 2026-03-24 12:45:51 -04:00
Richie 2577b791f7 removing antigravity 2026-03-24 12:41:28 -04:00
Richie b4d9562591 fixed treefmt 2026-03-22 19:07:23 -04:00