[EDP][UI] Allow relative or absolute paths in the local hdfs
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Sahara | Fix Released | High | Chad Roberts |
Bug Description
Sahara currently demands that the URL for an hdfs data source begins with the hdfs scheme "hdfs://". The UI enforces this restriction as well.
When the hdfs scheme is specified, hadoop requires the hostname and/or port to be specified as well. For use of an external hdfs, of course, the host/port are always necessary.
On a long running cluster, the local hdfs can be used by specifying the host/port of the local namenode in the path so that data can be read/written to the cluster hdfs, for example "hdfs:/
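Hadoop also accepts paths without a scheme and resolves them against the cluster's default filesystem, with relative paths taken from the user's HDFS home directory. For example (assuming the hadoop user), the relative path "output_path" will evaluate to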
hdfs:
and the absolute path "/output_path" will evaluate to
hdfs:
Sahara should relax the restriction that hdfs URLs start with "hdfs://", and allow absolute/relative paths to access the local hdfs.
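The resolution behaviour described above can be sketched roughly as follows. This is only an approximation of what Hadoop does with scheme-less paths, not Hadoop's or Sahara's actual code. The function name `resolve_hdfs_path`, the namenode address `hdfs://namenode.example:8020`, and the `/user/<name>` home directory layout are assumptions made for illustration.

```python
import posixpath
from urllib.parse import urlparse


def resolve_hdfs_path(path, default_fs, user):
    """Approximate how a scheme-less path maps onto the default filesystem.

    `default_fs` is a hypothetical local namenode URL; `user` is the
    account whose HDFS home directory is assumed to be /user/<user>.
    """
    if urlparse(path).scheme:
        # Already a full URL: leave it untouched.
        return path
    if not path.startswith("/"):
        # Relative paths are assumed to start from the user's home directory.
        path = posixpath.join("/user", user, path)
    return default_fs.rstrip("/") + path


if __name__ == "__main__":
    fs = "hdfs://namenode.example:8020"  # hypothetical namenode address
    print(resolve_hdfs_path("output_path", fs, "hadoop"))
    print(resolve_hdfs_path("/output_path", fs, "hadoop"))
```

With these assumptions, the relative path lands under the hadoop user's home directory and the absolute path lands at the filesystem root, without the job author needing to know the namenode's host or port.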
summary: |
- [EDP] Allow relative or absolute paths in the local hdfs + [EDP][UI] Allow relative or absolute paths in the local hdfs |
Changed in sahara: | |
status: | New → Confirmed |
assignee: | nobody → Chad Roberts (croberts) |
importance: | Undecided → High |
milestone: | none → juno-1 |
Changed in sahara: | |
status: | Confirmed → In Progress |
status: | In Progress → Fix Committed |
Changed in sahara: | |
status: | Fix Committed → Fix Released |
Changed in sahara: | |
milestone: | juno-1 → 2014.2 |
The patch for this in the Sahara API is simple: do not require the hdfs scheme, but if a scheme is present, enforce that it is hdfs and that a hostname is given. The UI will need changes to allow paths without a scheme (but the change can be tested with the CLI).
diff --git a/sahara/service/validations/edp/data_source.py b/sahara/service/validations/edp/data_source.py
index d2c8072..89d393c 100644
--- a/sahara/service/validations/edp/data_source.py
+++ b/sahara/service/validations/edp/data_source.py
@@ -69,8 +69,9 @@ def _check_hdfs_data_source_create(data):
     if len(data['url']) == 0:
         raise ex.InvalidException("HDFS url must not be empty")
     url = urlparse.urlparse(data['url'])
-    if url.scheme != "hdfs":
-        raise ex.InvalidException("URL scheme must be 'hdfs'")
-    if not url.hostname:
-        raise ex.InvalidException("HDFS url is incorrect, "
-                                  "cannot determine a hostname")
+    if url.scheme:
+        if url.scheme != "hdfs":
+            raise ex.InvalidException("URL scheme must be 'hdfs'")
+        if not url.hostname:
+            raise ex.InvalidException("HDFS url is incorrect, "
+                                      "cannot determine a hostname")
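To try the relaxed check outside of Sahara, a standalone sketch along the following lines may help. It follows the logic of the patch but is not the Sahara code itself: `InvalidException` here is a hypothetical stand-in for Sahara's `ex.InvalidException`, the function name `check_hdfs_url` is made up, the example hostname is invented, and Python 3's `urllib.parse` is used in place of the Python 2 `urlparse` module seen in the diff.

```python
from urllib.parse import urlparse


class InvalidException(Exception):
    """Hypothetical stand-in for Sahara's ex.InvalidException."""


def check_hdfs_url(url_string):
    """Accept scheme-less paths; require hdfs and a hostname otherwise."""
    if len(url_string) == 0:
        raise InvalidException("HDFS url must not be empty")
    url = urlparse(url_string)
    if url.scheme:
        # A scheme was given, so the original restrictions still apply.
        if url.scheme != "hdfs":
            raise InvalidException("URL scheme must be 'hdfs'")
        if not url.hostname:
            raise InvalidException("HDFS url is incorrect, "
                                   "cannot determine a hostname")


if __name__ == "__main__":
    candidates = ["output_path", "/output_path",
                  "hdfs://namenode.example:8020/data",  # hypothetical host
                  "swift://container/object", "hdfs:///no_host"]
    for candidate in candidates:
        try:
            check_hdfs_url(candidate)
            print(candidate, "accepted")
        except InvalidException as err:
            print(candidate, "rejected:", err)
```

Under these assumptions, relative and absolute paths pass validation untouched, while any URL that names a scheme is still held to the original hdfs-plus-hostname rule.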