-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Roadmap to whats next #1
Comments
Hey @kdpsingh I know you're super busy, but real quick, I've been going on a bug finding mission trying to use increasingly complex queries to find bugs. I've been able to replicate nearly all of the answers on this site I found, and I now have a postgres parser that I have been testing here and it works (with ability to switch back and forth) but I was wondering if you might have an idea of place with more complex queries that I can use to further bug search and test. thank you! here as an example of one of the more involved queries ive been able to replicate.
|
This looks amazing!!! Congrats on all the progress. Will give this a look soon and see where we should go next. My sense is that all the ingredients are here to packagify this. |
Thank you! Sounds great. I'm open to any direction. This has just been fun to try building I'll work on getting docstrings / documentation sorted |
just an fyi so u hopefully don't duplicate work and can avoid the joy of docstrings, but I went ahead and got the docstrings actually running here ( i had to get rid of all the show_query examples and use collect for most examples because of hidden space characters i could not find) |
Thank you for moving this to TidierOrg - was just about to suggest this. I have an idea for how to use this alongside TidierData for now that doesn't require a big rewrite of TidierData. |
Of course. I'm curious what that will look like, because avoiding a rewrite of either would be nice. I tried a few things to no avail for getting them to dispatch nicely together. (Ie creating a third macro that auto detects and then dispatches the correct version. ) I know it's not ideal but I'd be open to having it be |
Here's what I would suggest. When we use TidierDB with TidierData, we shouldn't do Then, all of the TidierDB macros will be available as |
Also, can we rename |
Yes absolutely I've been wanting to rename that because it was too long. I love that import method. And then it would allow TidierData to be called as normal too? I think that's great |
Yes, TidierData would work as normal so could use them together. |
Awesome. I think one of the last things then is getting the docs to build. I have all the files in place, but I'm not sure how to enable it to build |
I haven't gotten the chance to play with it yet. Did you have to install SQLite and/or DuckDB to get the code to work? Or was installing the relevant Julia packages for those sufficient? Just thinking about what the issue may be. If you can't get it working, I'll look at it tonight or tomorrow. |
Simpler question: do the docs build locally? |
I think they build locally. I changed the default to duckdb instead of SQLite and export 2 functions for opening the db and connecting. (Using other databases means the user has to write in using I think this has something to do with the package set up (ie git actions and I didn't set up for documenter. ) because there are only 2 green checks not 3 like other packages when pushing the new commits |
Ok let me check on this - will get to it tonight or tomorrow at the latest. Once the docs build, we can get this on the way to the registry. |
Also I would go ahead and add the Arrow dependency to make it work with DuckDB. |
I'm also considering allowing TidierData to work directly with Arrow but that will require a bit of work to ensure that we don't fully instantiate Arrow tables as DataFrames. This way via DuckDB might be a nice alternative. |
Sweet. I'll get that added a little later tn and change the name to |
I realized The documenter.yml just needs to be added and then it should work. So I should be able to get both this and tidierfiles working tn |
There's one setting you'll have to change in Settings to make the documentation actually show up. If tests pass, happy to help. |
So i have reached a bit of an impasse. I have been unable to get the docs to build and continue getting this error error I have tried various things including making it so no code runs because everything is a comment in the docs, and no matter what i get this error. I am wondering if its something you have ever experienced?
|
I haven't experienced this exact error but let me investigate today. |
sounds good, the docs seem to run locally on my end so curious what the issue is.
|
I've had this error - it seems to happen when there is an example that does not output anything |
That's exactly what the issue was. Now fixed. |
This is a map as guide for what is to come and what others might want to contribute if they feel so inclined.
Macros
mutate
select
arrange
distinct
filter
slice_min
,slice_max
,@slice_sample
group_by
count
summarize
/summarise
left_join
- will try to sort out syntax differences from tidier.jl.right_join
inner_join
rename
window_order
window_frame
head
relocate
fill
pivot_longer
andpivot_wider
collect
show_query
Functions
expr_to_sql
to useMacrotools.postwalk
start_query_meta
- allows query to begin and stores metadata to enable tidy selectioncase_when
,if_else
starts_with
,ends_with
,contains
,str_detect
is_missing()
replace_missing
andmissing_if
as_integer
,as_float
,as_string
MAX
,MIN
,SUM
,AVG
,COUNT
asn()
insummarize
across
insummarize
across
inmutate
For Postgres/DuckDB specific parser
year
,month
,day
,hour
,min
,second
,floor_date
difftime
needed for postgres and mysqlstr_detect
,str_replace/_all
,str_remove/_all
corr
,cov
,std
,var
collect
/start_query_meta
for postgres/duckdbFeatures
so i fixed it again to avoid using eval, but it just requires a hidden context and an add function. it works, but its a few extra keystrokes for the user.
copy_to
added support for SQLite and DuckDBconnect()
For duckdb directly read in paths/ https without local intermediary
missing something? drop a comment/new issue
Backends currently supported:
DuckDB
SQLite
MySQL (MariaDB supported when using MySQL.jl too)
Postgres
MSSQL
Clickhouse
Google Big Query
AWS
Oracle
Snowflake
databricks
Up next:
Oracle via Oracle.jl? Vs ODBC.jl this discussion suggests it could work for both this and redshift. But need to test
Hive/Impala?
Druid? Druid.jl library hasnt changed much in years..but it should work?no current or intended maintenance per the author. Could be done thru rest api if interest is thereRedshift? it looks like libpq may enable redshift connecting so this could be pretty straightforward vs ODBC.jl which for either case would mean just adding a new parser
Spark would be nice .. but using Spark.jl requires folks to add things from terminal and may lead to build issues for the general user.. so im a bit apprehensive to add it as a dependency
The text was updated successfully, but these errors were encountered: