Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for UC volumes in DABs #1762

Open
wants to merge 45 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 35 commits
Commits
Show all changes
45 commits
Select commit Hold shift + click to select a range
f772ce4
first comment
shreyas-goenka Sep 9, 2024
8f4f3ae
Merge remote-tracking branch 'origin' into feature/uc-volumes
shreyas-goenka Sep 9, 2024
ce5792c
fix convertor and add unit test
shreyas-goenka Sep 9, 2024
7c7abef
run as support
shreyas-goenka Sep 9, 2024
d04b6b0
-
shreyas-goenka Sep 9, 2024
9b66cd5
add apply target mode prefix functionality
shreyas-goenka Sep 9, 2024
4b22e2d
add conversion tests
shreyas-goenka Sep 9, 2024
88d0402
add inteprolation for volumes fields
shreyas-goenka Sep 9, 2024
6f9817e
add prompt and crud test for volumes
shreyas-goenka Sep 10, 2024
d47b0d6
Merge remote-tracking branch 'origin' into feature/uc-volumes
shreyas-goenka Sep 10, 2024
d180bab
add filer
shreyas-goenka Sep 15, 2024
fa54577
Merge remote-tracking branch 'origin' into feature/uc-volumes
shreyas-goenka Sep 15, 2024
73826ac
fix test and improve error
shreyas-goenka Sep 15, 2024
de7eb94
-
shreyas-goenka Sep 15, 2024
f10038a
-
shreyas-goenka Sep 15, 2024
df3bbad
add integration tests for artifacts portion:
shreyas-goenka Sep 15, 2024
aeab4ef
unit test for comparision of locatoin
shreyas-goenka Sep 15, 2024
aa2e16d
cleanup and add test
shreyas-goenka Sep 16, 2024
a90eb57
fix unit tests
shreyas-goenka Sep 16, 2024
39cb5e8
fix test on windows
shreyas-goenka Sep 16, 2024
13748f1
cleanup todos
shreyas-goenka Sep 16, 2024
bdecd08
-
shreyas-goenka Sep 16, 2024
274fd63
iter
shreyas-goenka Sep 16, 2024
d3d5d4c
-
shreyas-goenka Sep 16, 2024
227dfe9
fixes
shreyas-goenka Sep 16, 2024
e43f566
Merge remote-tracking branch 'origin' into feature/uc-volumes
shreyas-goenka Oct 14, 2024
f919e94
typo fix
shreyas-goenka Oct 14, 2024
9921263
separate GetFilerForLibraries tests
shreyas-goenka Oct 14, 2024
266c26c
fix tesT
shreyas-goenka Oct 14, 2024
a9b8575
fmt and fix test
shreyas-goenka Oct 14, 2024
c5a02ef
split into filer files
shreyas-goenka Oct 15, 2024
eb94cd6
-
shreyas-goenka Oct 15, 2024
3e3ddfd
fix test
shreyas-goenka Oct 15, 2024
d241c2b
add integration test for grant on volume
shreyas-goenka Oct 15, 2024
6192835
add custom prefixing behaviour for volumes
shreyas-goenka Oct 15, 2024
701b178
fmt
shreyas-goenka Oct 15, 2024
810da66
-
shreyas-goenka Oct 16, 2024
1a961eb
-
shreyas-goenka Oct 16, 2024
8a2fe49
merge
shreyas-goenka Oct 29, 2024
49b2cf2
Merge remote-tracking branch 'origin' into feature/uc-volumes
shreyas-goenka Oct 31, 2024
e32ebd0
use IsVolumesPath
shreyas-goenka Oct 31, 2024
6b12234
better message
shreyas-goenka Oct 31, 2024
f9287e0
address comments
shreyas-goenka Oct 31, 2024
250d426
rename to volume
shreyas-goenka Oct 31, 2024
1218178
-
shreyas-goenka Oct 31, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 4 additions & 9 deletions bundle/artifacts/upload.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,18 +21,13 @@ func (m *cleanUp) Name() string {
}

func (m *cleanUp) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
uploadPath, err := libraries.GetUploadBasePath(b)
if err != nil {
return diag.FromErr(err)
}

client, err := libraries.GetFilerForLibraries(b.WorkspaceClient(), uploadPath)
if err != nil {
return diag.FromErr(err)
client, uploadPath, diags := libraries.GetFilerForLibraries(ctx, b)
if diags.HasError() {
return diags
}

// We intentionally ignore the error because it is not critical to the deployment
err = client.Delete(ctx, ".", filer.DeleteRecursively)
err := client.Delete(ctx, ".", filer.DeleteRecursively)
if err != nil {
log.Errorf(ctx, "failed to delete %s: %v", uploadPath, err)
}
Expand Down
26 changes: 26 additions & 0 deletions bundle/config/mutator/apply_presets.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@ import (
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/cli/libs/dyn/dynvar"
"github.com/databricks/cli/libs/log"
"github.com/databricks/cli/libs/textutil"
"github.com/databricks/databricks-sdk-go/service/catalog"
"github.com/databricks/databricks-sdk-go/service/jobs"
Expand Down Expand Up @@ -193,6 +195,30 @@ func (m *applyPresets) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnos
// the Databricks UI and via the SQL API.
}

// Apply the prefix to volumes
shreyas-goenka marked this conversation as resolved.
Show resolved Hide resolved
for _, v := range r.Volumes {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manually verified that this indeed works end to end.

if containsUserIdentity(v.CatalogName, b.Config.Workspace.CurrentUser) {
log.Debugf(ctx, "Skipping prefix for volume %s because catalog %s contains the current user's identity", v.Name, v.CatalogName)
continue
}
if containsUserIdentity(v.SchemaName, b.Config.Workspace.CurrentUser) {
log.Debugf(ctx, "Skipping prefix for volume %s because schema %s contains the current user's identity", v.Name, v.SchemaName)
continue
}
// We only have to check for ${resources.schemas...} references because any
// other valid reference (like ${var.foo}) would have been interpolated by this point.
if p, ok := dynvar.PureReferenceToPath(v.SchemaName); ok && p.HasPrefix(dyn.Path{dyn.Key("resources"), dyn.Key("schemas")}) {
log.Debugf(ctx, "Skipping prefix for volume %s because schema name %s is defined in the bundle and will be interpolated at runtime", v.Name, v.SchemaName)
continue

}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels magical and has the risk of the same issue we have with schemas today.

To avoid we could make prefixing opt-in through a preset, and recommend setting it in development mode unless there is signal that prefixing is already done. Then it will be a conscious decision at all times. We can discuss more but this should be in line with what we do for schemas (and later also the registered models).


v.Name = normalizePrefix(prefix) + v.Name

// HTTP API for volumes doesn't yet support tags. It's only supported in
// the Databricks UI and via the SQL API.
}

// Clusters: Prefix, Tags
for key, c := range r.Clusters {
if c.ClusterSpec == nil {
Expand Down
143 changes: 143 additions & 0 deletions bundle/config/mutator/apply_presets_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,9 @@ import (
"github.com/databricks/cli/bundle/config/mutator"
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/databricks-sdk-go/service/catalog"
"github.com/databricks/databricks-sdk-go/service/iam"
"github.com/databricks/databricks-sdk-go/service/jobs"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)

Expand Down Expand Up @@ -125,6 +127,147 @@ func TestApplyPresetsPrefixForUcSchema(t *testing.T) {
}
}

func TestApplyPresentsReminderToAddSupportForSkippingPrefixes(t *testing.T) {
_, ok := config.SupportedResources()["catalogs"]
assert.False(t, ok,
`Since you are adding support for UC catalogs to DABs please ensure that
you add logic to skip applying presets.name_prefix for UC schemas, UC volumes and
any other resources that fall under a catalog in the UC hierarchy (like registered models).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to retroactively change the prefix behaviour for registered models as well to keep it consistent? No one has complained yet so it's probably fine to keep as is.

Once you do so feel free to remove this test.`)
}

func TestApplyPresetsPrefixForUcVolumes(t *testing.T) {
tests := []struct {
name string
prefix string
volume *resources.Volume
want string
}{
{
name: "add prefix to volume",
prefix: "[prefix]",
volume: &resources.Volume{
CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{
Name: "volume1",
CatalogName: "catalog1",
SchemaName: "schema1",
},
},
want: "prefix_volume1",
},
{
name: "add empty prefix to volume",
prefix: "",
volume: &resources.Volume{
CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{
Name: "volume1",
},
},
want: "volume1",
},
{
name: "skip prefix when catalog name contains user short name",
prefix: "[prefix]",
volume: &resources.Volume{
CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{
Name: "volume1",
CatalogName: "dev_john_wick_targets",
},
},
want: "volume1",
},
{
name: "skip prefix when catalog name contains user email",
prefix: "[prefix]",
volume: &resources.Volume{
CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{
Name: "volume1",
CatalogName: "[email protected]_targets",
},
},
want: "volume1",
},
{
name: "skip prefix when schema name contains user short name",
prefix: "[prefix]",
volume: &resources.Volume{
CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{
Name: "volume1",
SchemaName: "dev_john_wick_weapons",
},
},
want: "volume1",
},
{
name: "skip prefix when schema name contains user email",
prefix: "[prefix]",
volume: &resources.Volume{
CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{
Name: "volume1",
SchemaName: "[email protected]_targets",
},
},
want: "volume1",
},
{
name: "skip prefix when schema is defined in the bundle and will be interpolated at runtime",
prefix: "[prefix]",
volume: &resources.Volume{
CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{
Name: "volume1",
SchemaName: "${resources.schemas.schema1.name}",
},
},
want: "volume1",
},
{
name: "add prefix when schema is a reference without the resources.schemas prefix",
prefix: "[prefix]",
volume: &resources.Volume{
CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{
Name: "volume1",
SchemaName: "${foo.bar.baz}",
},
},
want: "prefix_volume1",
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
b := &bundle.Bundle{
Config: config.Root{
Resources: config.Resources{
Volumes: map[string]*resources.Volume{
"volume1": tt.volume,
},
},
Presets: config.Presets{
NamePrefix: tt.prefix,
},
Workspace: config.Workspace{
CurrentUser: &config.User{
ShortName: "john_wick",
User: &iam.User{
UserName: "[email protected]",
},
},
},
},
}

ctx := context.Background()
diag := bundle.Apply(ctx, b, mutator.ApplyPresets())

if diag.HasError() {
t.Fatalf("unexpected error: %v", diag)
}

require.Equal(t, tt.want, b.Config.Resources.Volumes["volume1"].Name)
})
}
}

func TestApplyPresetsTags(t *testing.T) {
tests := []struct {
name string
Expand Down
6 changes: 5 additions & 1 deletion bundle/config/mutator/process_target_mode.go
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,10 @@ func transformDevelopmentMode(ctx context.Context, b *bundle.Bundle) {
}
}

func containsUserIdentity(s string, u *config.User) bool {
return strings.Contains(s, u.ShortName) || strings.Contains(s, u.UserName)
}

func validateDevelopmentMode(b *bundle.Bundle) diag.Diagnostics {
var diags diag.Diagnostics
p := b.Config.Presets
Expand Down Expand Up @@ -92,7 +96,7 @@ func validateDevelopmentMode(b *bundle.Bundle) diag.Diagnostics {
diags = diags.Extend(diag.Errorf("%s must start with '~/' or contain the current username to ensure uniqueness when using 'mode: development'", path))
}
}
if p.NamePrefix != "" && !strings.Contains(p.NamePrefix, u.ShortName) && !strings.Contains(p.NamePrefix, u.UserName) {
if p.NamePrefix != "" && !containsUserIdentity(p.NamePrefix, u) {
// Resources such as pipelines require a unique name, e.g. '[dev steve] my_pipeline'.
// For this reason we require the name prefix to contain the current username;
// it's a pitfall for users if they don't include it and later find out that
Expand Down
7 changes: 7 additions & 0 deletions bundle/config/mutator/process_target_mode_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -120,6 +120,9 @@ func mockBundle(mode config.Mode) *bundle.Bundle {
Schemas: map[string]*resources.Schema{
"schema1": {CreateSchema: &catalog.CreateSchema{Name: "schema1"}},
},
Volumes: map[string]*resources.Volume{
"volume1": {CreateVolumeRequestContent: &catalog.CreateVolumeRequestContent{Name: "volume1"}},
},
Clusters: map[string]*resources.Cluster{
"cluster1": {ClusterSpec: &compute.ClusterSpec{ClusterName: "cluster1", SparkVersion: "13.2.x", NumWorkers: 1}},
},
Expand Down Expand Up @@ -288,6 +291,8 @@ func TestProcessTargetModeDefault(t *testing.T) {
assert.Equal(t, "servingendpoint1", b.Config.Resources.ModelServingEndpoints["servingendpoint1"].Name)
assert.Equal(t, "registeredmodel1", b.Config.Resources.RegisteredModels["registeredmodel1"].Name)
assert.Equal(t, "qualityMonitor1", b.Config.Resources.QualityMonitors["qualityMonitor1"].TableName)
assert.Equal(t, "schema1", b.Config.Resources.Schemas["schema1"].Name)
assert.Equal(t, "volume1", b.Config.Resources.Volumes["volume1"].Name)
assert.Equal(t, "cluster1", b.Config.Resources.Clusters["cluster1"].ClusterName)
}

Expand Down Expand Up @@ -331,6 +336,8 @@ func TestProcessTargetModeProduction(t *testing.T) {
assert.Equal(t, "servingendpoint1", b.Config.Resources.ModelServingEndpoints["servingendpoint1"].Name)
assert.Equal(t, "registeredmodel1", b.Config.Resources.RegisteredModels["registeredmodel1"].Name)
assert.Equal(t, "qualityMonitor1", b.Config.Resources.QualityMonitors["qualityMonitor1"].TableName)
assert.Equal(t, "schema1", b.Config.Resources.Schemas["schema1"].Name)
assert.Equal(t, "volume1", b.Config.Resources.Volumes["volume1"].Name)
assert.Equal(t, "cluster1", b.Config.Resources.Clusters["cluster1"].ClusterName)
}

Expand Down
2 changes: 2 additions & 0 deletions bundle/config/mutator/run_as_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ func allResourceTypes(t *testing.T) []string {
"quality_monitors",
"registered_models",
"schemas",
"volumes",
},
resourceTypes,
)
Expand Down Expand Up @@ -140,6 +141,7 @@ func TestRunAsErrorForUnsupportedResources(t *testing.T) {
"registered_models",
"experiments",
"schemas",
"volumes",
}

base := config.Root{
Expand Down
2 changes: 2 additions & 0 deletions bundle/config/resources.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ type Resources struct {
RegisteredModels map[string]*resources.RegisteredModel `json:"registered_models,omitempty"`
QualityMonitors map[string]*resources.QualityMonitor `json:"quality_monitors,omitempty"`
Schemas map[string]*resources.Schema `json:"schemas,omitempty"`
Volumes map[string]*resources.Volume `json:"volumes,omitempty"`
Clusters map[string]*resources.Cluster `json:"clusters,omitempty"`
}

Expand Down Expand Up @@ -75,6 +76,7 @@ func SupportedResources() map[string]ResourceDescription {
"registered_models": {SingularName: "registered_model"},
"quality_monitors": {SingularName: "quality_monitor"},
"schemas": {SingularName: "schema"},
"volumes": {SingularName: "volume"},
"clusters": {SingularName: "cluster"},
}
}
27 changes: 27 additions & 0 deletions bundle/config/resources/volume.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
package resources

import (
"github.com/databricks/databricks-sdk-go/marshal"
"github.com/databricks/databricks-sdk-go/service/catalog"
)

type Volume struct {
// List of grants to apply on this volume.
Grants []Grant `json:"grants,omitempty"`

// Full name of the volume (catalog_name.schema_name.volume_name). This value is read from
// the terraform state after deployment succeeds.
ID string `json:"id,omitempty" bundle:"readonly"`

*catalog.CreateVolumeRequestContent
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notably OWNER is missing from the schema here. Terraform allows managing the owner, but it's not available in the create API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't do owner assignment elsewhere, can you clarify the significance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No real significance for DABs. I just wanted to call this out since it's a divergence from Terraform. In fact, this is nice because it clearly guarantees that the owner of the UC volume is the deploying user. We do not allow modifying the owner of any of our UC resources.


ModifiedStatus ModifiedStatus `json:"modified_status,omitempty" bundle:"internal"`
}

func (v *Volume) UnmarshalJSON(b []byte) error {
return marshal.Unmarshal(b, v)
}

func (v Volume) MarshalJSON() ([]byte, error) {
return marshal.Marshal(v)
}
15 changes: 15 additions & 0 deletions bundle/deploy/terraform/convert.go
Original file line number Diff line number Diff line change
Expand Up @@ -406,6 +406,16 @@ func TerraformToBundle(state *resourcesState, config *config.Root) error {
}
cur.ID = instance.Attributes.ID
config.Resources.Schemas[resource.Name] = cur
case "databricks_volume":
if config.Resources.Volumes == nil {
config.Resources.Volumes = make(map[string]*resources.Volume)
}
cur := config.Resources.Volumes[resource.Name]
if cur == nil {
cur = &resources.Volume{ModifiedStatus: resources.ModifiedStatusDeleted}
}
cur.ID = instance.Attributes.ID
config.Resources.Volumes[resource.Name] = cur
case "databricks_cluster":
if config.Resources.Clusters == nil {
config.Resources.Clusters = make(map[string]*resources.Cluster)
Expand Down Expand Up @@ -465,6 +475,11 @@ func TerraformToBundle(state *resourcesState, config *config.Root) error {
src.ModifiedStatus = resources.ModifiedStatusCreated
}
}
for _, src := range config.Resources.Volumes {
if src.ModifiedStatus == "" && src.ID == "" {
src.ModifiedStatus = resources.ModifiedStatusCreated
}
}
for _, src := range config.Resources.Clusters {
if src.ModifiedStatus == "" && src.ID == "" {
src.ModifiedStatus = resources.ModifiedStatusCreated
Expand Down
Loading
Loading