
🎓 Complete Dataverse Pro Manager Tutorials – All Tabs

🎓 Complete Tutorials Library

Comprehensive Step-by-Step Guides for All Dataverse Pro Manager Tabs

✨ With Real Industry Examples from Healthcare, Retail, Finance, Manufacturing & More ✨

🐍 Python Tab – Complete Tutorial

Query, analyze, join tables across unlimited environments with pandas & matplotlib

STEP 1: Setup – Add Environments

1.1 Connect Your Environments

1. Go to ☰ menu → "🌐 Environments"
2. Click "Add New Environment"
3. Enter:
   – Name: "Production"
   – URL: https://yourorg.crm.dynamics.com
4. Click "Connect" → Complete device code auth
5. Click "Load All Tables"
6. Repeat for unlimited environments!

1.2 Verify in Python Tab

# List all connected environments
list_environments()

# Output shows:
# 📋 Available Environments:
# 1. Production
# 2. Development
# 3. UAT
# 4. Sandbox

STEP 2: Query Tables

Basic Query

import pandas as pd

# Query all accounts (up to 5000 records)
df = query_dataverse("Production", "account")

print(f"Retrieved {len(df)} records with {len(df.columns)} columns")
print(df.head())

Query with Filters

# Get high-value accounts only
df = query_dataverse(
    "Production",
    "account",
    select_columns=["name", "revenue", "industrycode"],
    filter_query="revenue gt 1000000",
    top=100
)
print(f"High-value accounts: {len(df)}")
Filter Type | OData Syntax
Greater than | revenue gt 1000000
Equals | statecode eq 0
Starts with | startswith(name, 'A')
Not null | revenue ne null
Multiple | revenue gt 1000000 and statecode eq 0

STEP 3: Join Tables – Same Environment

Example: Accounts + Contacts

import pandas as pd

# Query both tables
accounts = query_dataverse("Production", "account",
                           select_columns=["accountid", "name", "revenue"])
contacts = query_dataverse("Production", "contact",
                           select_columns=["contactid", "fullname", "parentcustomerid"])

# Join them
merged = pd.merge(accounts, contacts,
                  left_on='accountid', right_on='parentcustomerid',
                  how='inner')
print(f"Merged: {len(merged)} rows")

# Analysis: Contacts per account
result = merged.groupby('name').size().reset_index(name='contact_count')
result = result.sort_values('contact_count', ascending=False)
print("Top 10 accounts by contacts:")
print(result.head(10))

# Save
ingest_to_dataverse(result, "Production", "contacts_per_account")

STEP 4: Join Tables – Cross Environment

Example: Compare Prod vs Dev

import pandas as pd

# Query from both environments
prod = query_dataverse("Production", "account", select_columns=["name", "revenue"])
dev = query_dataverse("Development", "account", select_columns=["name", "revenue"])

# Compare
comparison = pd.merge(prod, dev, on='name', how='outer',
                      suffixes=('_prod', '_dev'), indicator=True)

# Find differences
only_prod = comparison[comparison['_merge'] == 'left_only']
only_dev = comparison[comparison['_merge'] == 'right_only']
in_both = comparison[comparison['_merge'] == 'both']

print(f"Only in Prod: {len(only_prod)}")
print(f"Only in Dev: {len(only_dev)}")
print(f"In both: {len(in_both)}")

# Save comparison
ingest_to_dataverse(comparison, "Production", "prod_vs_dev")

STEP 5: Visualizations

Create Charts

import matplotlib.pyplot as plt

# Get data
df = query_dataverse("Production", "account",
                     select_columns=["industrycode", "revenue"],
                     filter_query="revenue ne null")

# Aggregate
industry_revenue = df.groupby('industrycode')['revenue'].sum() / 1000000
industry_revenue = industry_revenue.sort_values(ascending=False).head(10)

# Plot
plt.figure(figsize=(12, 6))
plt.bar(range(len(industry_revenue)), industry_revenue.values)
plt.xticks(range(len(industry_revenue)), industry_revenue.index, rotation=45, ha='right')
plt.title('Top 10 Industries by Revenue', fontsize=16, fontweight='bold')
plt.ylabel('Revenue ($ Millions)')
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()  # Auto-displays in browser!

STEP 6: Save Results

Ingest Analysis Results

import pandas as pd
from datetime import datetime

# Perform analysis
df = query_dataverse("Production", "account")
summary = df.groupby('industrycode').agg({
    'revenue': ['sum', 'mean', 'count']
}).reset_index()

# Add metadata
summary['analysis_date'] = datetime.now().strftime('%Y-%m-%d')

# Save to staging table
result = ingest_to_dataverse(summary, "Production", "industry_analysis")
print(f"✅ Saved {result['success']} records to {result['table']}")

🏭 INDUSTRY EXAMPLES

HEALTHCARE

Patient Readmission Analysis

import pandas as pd
import matplotlib.pyplot as plt

# Query patient admissions
admissions = query_dataverse("Healthcare_Prod", "msemr_patient",
                             select_columns=["msemr_patientid", "msemr_name",
                                             "msemr_birthdate", "msemr_chronicconditions"])
visits = query_dataverse("Healthcare_Prod", "msemr_encounter",
                         select_columns=["msemr_encounterid", "msemr_patientvalue",
                                         "msemr_admitdate", "msemr_dischargedate"])

# Join patients and visits
patient_visits = pd.merge(admissions, visits,
                          left_on='msemr_patientid', right_on='msemr_patientvalue')

# Convert dates
patient_visits['msemr_admitdate'] = pd.to_datetime(patient_visits['msemr_admitdate'])
patient_visits['msemr_dischargedate'] = pd.to_datetime(patient_visits['msemr_dischargedate'])

# Calculate length of stay
patient_visits['los_days'] = (patient_visits['msemr_dischargedate'] -
                              patient_visits['msemr_admitdate']).dt.days

# Identify readmissions (within 30 days)
patient_visits = patient_visits.sort_values(['msemr_patientid', 'msemr_admitdate'])
patient_visits['next_admit'] = patient_visits.groupby('msemr_patientid')['msemr_admitdate'].shift(-1)
patient_visits['days_to_readmit'] = (patient_visits['next_admit'] -
                                     patient_visits['msemr_dischargedate']).dt.days
patient_visits['is_readmission'] = patient_visits['days_to_readmit'] <= 30

# Analysis
readmit_rate = patient_visits['is_readmission'].sum() / len(patient_visits) * 100
print(f"30-day Readmission Rate: {readmit_rate:.2f}%")

# Readmissions by chronic conditions
chronic_readmit = patient_visits.groupby('msemr_chronicconditions')['is_readmission'].mean() * 100
chronic_readmit = chronic_readmit.sort_values(ascending=False)

# Visualize
plt.figure(figsize=(12, 6))
plt.barh(range(len(chronic_readmit)), chronic_readmit.values)
plt.yticks(range(len(chronic_readmit)), chronic_readmit.index)
plt.xlabel('Readmission Rate (%)')
plt.title('Readmission Rate by Chronic Condition')
plt.tight_layout()
plt.show()

# Save insights
result = ingest_to_dataverse(patient_visits, "Healthcare_Prod", "readmission_analysis")
print(f"✅ Analysis saved: {result['success']} records")

RETAIL

Customer Purchase Patterns & RFM Segmentation

import pandas as pd
import matplotlib.pyplot as plt

# Query customers and orders
customers = query_dataverse("Retail_Prod", "account",
                            select_columns=["accountid", "name", "emailaddress1"])
orders = query_dataverse("Retail_Prod", "salesorder",
                         select_columns=["salesorderid", "customerid", "totalamount",
                                         "ordernumber", "createdon"],
                         top=50000)

# RFM Analysis
current_date = pd.Timestamp.now()
orders['createdon'] = pd.to_datetime(orders['createdon'])

rfm = orders.groupby('customerid').agg({
    'createdon': lambda x: (current_date - x.max()).days,  # Recency
    'salesorderid': 'count',                               # Frequency
    'totalamount': 'sum'                                   # Monetary
}).reset_index()
rfm.columns = ['customerid', 'recency_days', 'purchase_count', 'total_spent']

# Score RFM (1-5 scale)
rfm['R_score'] = pd.qcut(rfm['recency_days'], 5, labels=[5, 4, 3, 2, 1])
rfm['F_score'] = pd.qcut(rfm['purchase_count'].rank(method='first'), 5, labels=[1, 2, 3, 4, 5])
rfm['M_score'] = pd.qcut(rfm['total_spent'], 5, labels=[1, 2, 3, 4, 5])
rfm['RFM_score'] = (rfm['R_score'].astype(int) + rfm['F_score'].astype(int) +
                    rfm['M_score'].astype(int))

# Segment customers
def segment(score):
    if score >= 13:
        return 'Champions'
    elif score >= 10:
        return 'Loyal'
    elif score >= 7:
        return 'Potential'
    else:
        return 'At Risk'

rfm['segment'] = rfm['RFM_score'].apply(segment)

# Join with customer names
rfm_customers = pd.merge(rfm, customers, left_on='customerid', right_on='accountid')

# Segment summary
segments = rfm.groupby('segment').agg({
    'customerid': 'count',
    'total_spent': 'sum'
}).reset_index()
segments.columns = ['segment', 'customer_count', 'total_revenue']
print("📊 Customer Segments:")
print(segments)

# Visualize
fig, axs = plt.subplots(1, 2, figsize=(14, 6))
axs[0].bar(segments['segment'], segments['customer_count'])
axs[0].set_title('Customers by Segment')
axs[0].set_ylabel('Count')
axs[0].tick_params(axis='x', rotation=45)
axs[1].bar(segments['segment'], segments['total_revenue'] / 1000)
axs[1].set_title('Revenue by Segment')
axs[1].set_ylabel('Revenue ($1000s)')
axs[1].tick_params(axis='x', rotation=45)
plt.tight_layout()
plt.show()

# Save
ingest_to_dataverse(rfm_customers, "Retail_Prod", "customer_rfm_scores")
ingest_to_dataverse(segments, "Retail_Prod", "customer_segments")

print("✅ RFM Analysis Complete!")
print("💡 Use these segments for:")
print("   - Targeted email campaigns")
print("   - Loyalty program tiers")
print("   - Win-back campaigns for 'At Risk'")

FINANCE

Loan Default Risk Analysis

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Query loan applications
loans = query_dataverse("Finance_Prod", "loan",
                        select_columns=["loanid", "customerid", "amount", "interestrate",
                                        "term_months", "status", "credit_score", "income"])
customers = query_dataverse("Finance_Prod", "contact",
                            select_columns=["contactid", "fullname", "age",
                                            "employment_status", "debt_to_income"])

# Join
loan_data = pd.merge(loans, customers, left_on='customerid', right_on='contactid')

# Calculate risk metrics
loan_data['loan_to_income'] = loan_data['amount'] / loan_data['income']
loan_data['monthly_payment'] = (loan_data['amount'] * (loan_data['interestrate'] / 100 / 12) *
                                (1 + loan_data['interestrate'] / 100 / 12) ** loan_data['term_months']) / \
                               ((1 + loan_data['interestrate'] / 100 / 12) ** loan_data['term_months'] - 1)

# Define default (status = 'Defaulted' or 'Delinquent')
loan_data['is_default'] = loan_data['status'].isin(['Defaulted', 'Delinquent'])

# Risk factors analysis
print("📊 Default Rate by Credit Score Range:")
loan_data['credit_range'] = pd.cut(loan_data['credit_score'],
                                   bins=[0, 600, 650, 700, 750, 850],
                                   labels=['<600', '600-650', '650-700', '700-750', '750+'])
default_by_credit = loan_data.groupby('credit_range')['is_default'].agg(['sum', 'count'])
default_by_credit['default_rate'] = default_by_credit['sum'] / default_by_credit['count'] * 100
print(default_by_credit)

print("\n📊 Default Rate by Loan-to-Income Ratio:")
loan_data['lti_range'] = pd.cut(loan_data['loan_to_income'],
                                bins=[0, 2, 3, 4, 100],
                                labels=['<2x', '2-3x', '3-4x', '>4x'])
default_by_lti = loan_data.groupby('lti_range')['is_default'].agg(['sum', 'count'])
default_by_lti['default_rate'] = default_by_lti['sum'] / default_by_lti['count'] * 100
print(default_by_lti)

# Predict risk score (simple scoring model)
def calculate_risk_score(row):
    score = 0
    if row['credit_score'] < 600:
        score += 40
    elif row['credit_score'] < 650:
        score += 30
    elif row['credit_score'] < 700:
        score += 20
    if row['loan_to_income'] > 4:
        score += 30
    elif row['loan_to_income'] > 3:
        score += 20
    elif row['loan_to_income'] > 2:
        score += 10
    if row['debt_to_income'] > 0.5:
        score += 20
    elif row['debt_to_income'] > 0.4:
        score += 10
    return score

loan_data['risk_score'] = loan_data.apply(calculate_risk_score, axis=1)

def risk_category(score):
    if score >= 60:
        return 'High Risk'
    elif score >= 30:
        return 'Medium Risk'
    else:
        return 'Low Risk'

loan_data['risk_category'] = loan_data['risk_score'].apply(risk_category)

# Visualize
fig, axs = plt.subplots(2, 2, figsize=(14, 10))

# Chart 1: Default rate by credit score
axs[0, 0].bar(range(len(default_by_credit)), default_by_credit['default_rate'])
axs[0, 0].set_xticks(range(len(default_by_credit)))
axs[0, 0].set_xticklabels(default_by_credit.index)
axs[0, 0].set_title('Default Rate by Credit Score')
axs[0, 0].set_ylabel('Default Rate (%)')

# Chart 2: Risk distribution
risk_dist = loan_data['risk_category'].value_counts()
axs[0, 1].pie(risk_dist.values, labels=risk_dist.index, autopct='%1.1f%%')
axs[0, 1].set_title('Loan Portfolio by Risk Category')

# Chart 3: Loan amount by risk
risk_amounts = loan_data.groupby('risk_category')['amount'].sum() / 1000000
axs[1, 0].bar(risk_amounts.index, risk_amounts.values)
axs[1, 0].set_title('Total Loan Amount by Risk')
axs[1, 0].set_ylabel('Amount ($ Millions)')

# Chart 4: Risk score distribution
axs[1, 1].hist(loan_data['risk_score'], bins=20, edgecolor='black')
axs[1, 1].set_title('Risk Score Distribution')
axs[1, 1].set_xlabel('Risk Score')
axs[1, 1].set_ylabel('Number of Loans')

plt.tight_layout()
plt.show()

# Save risk analysis
result = ingest_to_dataverse(loan_data, "Finance_Prod", "loan_risk_analysis")
print(f"\n✅ Risk analysis saved: {result['success']} records")

print("\n💡 High Risk Loans require:")
print("   - Additional collateral")
print("   - Co-signer requirement")
print("   - Higher interest rates")
print("   - Enhanced monitoring")

MANUFACTURING

Production Quality & Defect Analysis

import pandas as pd
import matplotlib.pyplot as plt

# Query production data
production = query_dataverse("Manufacturing_Prod", "production_run",
                             select_columns=["runid", "product_id", "line_id",
                                             "quantity_produced", "quantity_defective",
                                             "run_date", "shift", "operator_id"])
products = query_dataverse("Manufacturing_Prod", "product",
                           select_columns=["productid", "name", "category", "target_yield"])

# Join
prod_data = pd.merge(production, products, left_on='product_id', right_on='productid')

# Calculate quality metrics
prod_data['defect_rate'] = (prod_data['quantity_defective'] /
                            prod_data['quantity_produced'] * 100)
prod_data['yield_rate'] = ((prod_data['quantity_produced'] - prod_data['quantity_defective']) /
                           prod_data['quantity_produced'] * 100)
prod_data['meets_target'] = prod_data['yield_rate'] >= prod_data['target_yield']

# Analysis by product
product_quality = prod_data.groupby('name').agg({
    'quantity_produced': 'sum',
    'quantity_defective': 'sum',
    'defect_rate': 'mean',
    'yield_rate': 'mean'
}).reset_index()
product_quality = product_quality.sort_values('defect_rate', ascending=False)
print("📊 Top 10 Products by Defect Rate:")
print(product_quality.head(10))

# Analysis by production line
line_quality = prod_data.groupby('line_id').agg({
    'quantity_produced': 'sum',
    'quantity_defective': 'sum',
    'defect_rate': 'mean'
}).reset_index()
line_quality = line_quality.sort_values('defect_rate', ascending=False)
print("\n📊 Production Line Quality:")
print(line_quality)

# Analysis by shift
shift_quality = prod_data.groupby('shift').agg({
    'defect_rate': 'mean',
    'yield_rate': 'mean',
    'quantity_produced': 'sum'
}).reset_index()
print("\n📊 Quality by Shift:")
print(shift_quality)

# Trend analysis
prod_data['run_date'] = pd.to_datetime(prod_data['run_date'])
prod_data['week'] = prod_data['run_date'].dt.isocalendar().week
weekly_quality = prod_data.groupby('week').agg({
    'defect_rate': 'mean',
    'yield_rate': 'mean'
}).reset_index()

# Visualize
fig, axs = plt.subplots(2, 2, figsize=(16, 10))

# Chart 1: Top 10 defect rates by product
top10 = product_quality.head(10)
axs[0, 0].barh(range(len(top10)), top10['defect_rate'])
axs[0, 0].set_yticks(range(len(top10)))
axs[0, 0].set_yticklabels(top10['name'])
axs[0, 0].set_xlabel('Defect Rate (%)')
axs[0, 0].set_title('Top 10 Products by Defect Rate')

# Chart 2: Line comparison
axs[0, 1].bar(line_quality['line_id'], line_quality['defect_rate'])
axs[0, 1].set_xlabel('Production Line')
axs[0, 1].set_ylabel('Defect Rate (%)')
axs[0, 1].set_title('Defect Rate by Production Line')
axs[0, 1].axhline(y=prod_data['defect_rate'].mean(), color='r', linestyle='--', label='Average')
axs[0, 1].legend()

# Chart 3: Shift comparison
axs[1, 0].bar(shift_quality['shift'], shift_quality['defect_rate'])
axs[1, 0].set_xlabel('Shift')
axs[1, 0].set_ylabel('Defect Rate (%)')
axs[1, 0].set_title('Quality by Shift')

# Chart 4: Weekly trend
axs[1, 1].plot(weekly_quality['week'], weekly_quality['defect_rate'],
               marker='o', linewidth=2, label='Defect Rate')
axs[1, 1].plot(weekly_quality['week'], 100 - weekly_quality['yield_rate'],
               marker='s', linewidth=2, label='Yield Loss')
axs[1, 1].set_xlabel('Week Number')
axs[1, 1].set_ylabel('Rate (%)')
axs[1, 1].set_title('Quality Trend Over Time')
axs[1, 1].legend()
axs[1, 1].grid(alpha=0.3)

plt.tight_layout()
plt.show()

# Identify problem areas
high_defect_products = product_quality[product_quality['defect_rate'] > 5]
high_defect_lines = line_quality[line_quality['defect_rate'] > 3]
print(f"\n🚨 QUALITY ALERTS:")
print(f"   {len(high_defect_products)} products exceed 5% defect rate")
print(f"   {len(high_defect_lines)} production lines exceed 3% defect rate")

# Save analysis
result1 = ingest_to_dataverse(product_quality, "Manufacturing_Prod", "product_quality_analysis")
result2 = ingest_to_dataverse(line_quality, "Manufacturing_Prod", "line_quality_analysis")
print(f"\n✅ Quality analysis saved!")

print("\n💡 Recommended Actions:")
print("   - Inspect high-defect production lines")
print("   - Additional training for underperforming shifts")
print("   - Quality control process review for problem products")

EDUCATION

Student Performance & Intervention Analysis

import pandas as pd
import matplotlib.pyplot as plt

# Query student data
students = query_dataverse("Education_Prod", "student",
                           select_columns=["studentid", "name", "grade_level",
                                           "attendance_rate", "free_lunch_eligible"])
grades = query_dataverse("Education_Prod", "grade",
                         select_columns=["gradeid", "studentid", "course", "term",
                                         "grade_percentage", "credits"])
assessments = query_dataverse("Education_Prod", "assessment",
                              select_columns=["assessmentid", "studentid", "test_name",
                                              "score", "assessment_date"])

# Join student data with grades
student_grades = pd.merge(students, grades, on='studentid')

# Calculate GPA
def letter_to_gpa(percentage):
    if percentage >= 90:
        return 4.0
    elif percentage >= 80:
        return 3.0
    elif percentage >= 70:
        return 2.0
    elif percentage >= 60:
        return 1.0
    else:
        return 0.0

student_grades['gpa_points'] = student_grades['grade_percentage'].apply(letter_to_gpa)

# GPA by student
gpa_summary = student_grades.groupby(['studentid', 'name', 'grade_level',
                                      'attendance_rate', 'free_lunch_eligible']).agg({
    'gpa_points': 'mean',
    'grade_percentage': 'mean',
    'credits': 'sum'
}).reset_index()
gpa_summary.columns = ['studentid', 'name', 'grade_level', 'attendance_rate',
                       'free_lunch_eligible', 'gpa', 'avg_grade', 'total_credits']

# Identify at-risk students
def risk_category(row):
    risk_factors = 0
    if row['gpa'] < 2.0:
        risk_factors += 3
    elif row['gpa'] < 2.5:
        risk_factors += 2
    if row['attendance_rate'] < 85:
        risk_factors += 2
    elif row['attendance_rate'] < 90:
        risk_factors += 1
    if risk_factors >= 4:
        return 'High Risk'
    elif risk_factors >= 2:
        return 'Medium Risk'
    else:
        return 'On Track'

gpa_summary['risk_category'] = gpa_summary.apply(risk_category, axis=1)

# Analysis
print("📊 Student Risk Distribution:")
print(gpa_summary['risk_category'].value_counts())

print("\n📊 Average GPA by Risk Category:")
print(gpa_summary.groupby('risk_category')['gpa'].mean())

print("\n📊 Correlation: Attendance vs GPA")
corr = gpa_summary[['attendance_rate', 'gpa']].corr()
print(corr)

# Socioeconomic analysis
print("\n📊 Performance by Economic Status:")
econ_analysis = gpa_summary.groupby('free_lunch_eligible').agg({
    'gpa': 'mean',
    'attendance_rate': 'mean',
    'studentid': 'count'
}).reset_index()
econ_analysis.columns = ['free_lunch', 'avg_gpa', 'avg_attendance', 'student_count']
print(econ_analysis)

# Visualize
fig, axs = plt.subplots(2, 2, figsize=(14, 10))

# Chart 1: Risk distribution
risk_counts = gpa_summary['risk_category'].value_counts()
axs[0, 0].bar(risk_counts.index, risk_counts.values)
axs[0, 0].set_title('Student Risk Distribution')
axs[0, 0].set_ylabel('Number of Students')
axs[0, 0].tick_params(axis='x', rotation=45)

# Chart 2: Attendance vs GPA scatter
axs[0, 1].scatter(gpa_summary['attendance_rate'], gpa_summary['gpa'], alpha=0.5, s=50)
axs[0, 1].set_xlabel('Attendance Rate (%)')
axs[0, 1].set_ylabel('GPA')
axs[0, 1].set_title('Attendance Rate vs GPA')
axs[0, 1].grid(alpha=0.3)

# Chart 3: GPA distribution
axs[1, 0].hist(gpa_summary['gpa'], bins=20, edgecolor='black')
axs[1, 0].set_xlabel('GPA')
axs[1, 0].set_ylabel('Number of Students')
axs[1, 0].set_title('GPA Distribution')
axs[1, 0].axvline(x=2.0, color='r', linestyle='--', label='At-Risk Threshold')
axs[1, 0].legend()

# Chart 4: Performance by economic status
axs[1, 1].bar(['Free Lunch', 'No Free Lunch'],
              [econ_analysis[econ_analysis['free_lunch'] == True]['avg_gpa'].values[0],
               econ_analysis[econ_analysis['free_lunch'] == False]['avg_gpa'].values[0]])
axs[1, 1].set_ylabel('Average GPA')
axs[1, 1].set_title('Academic Performance by Economic Status')

plt.tight_layout()
plt.show()

# Intervention recommendations
high_risk = gpa_summary[gpa_summary['risk_category'] == 'High Risk']
print(f"\n🚨 INTERVENTION NEEDED:")
print(f"   {len(high_risk)} students identified as high risk")
print(f"   Average GPA: {high_risk['gpa'].mean():.2f}")
print(f"   Average Attendance: {high_risk['attendance_rate'].mean():.1f}%")

# Save analysis
result1 = ingest_to_dataverse(gpa_summary, "Education_Prod", "student_risk_analysis")
result2 = ingest_to_dataverse(high_risk, "Education_Prod", "intervention_list")
print(f"\n✅ Analysis saved!")

print("\n💡 Recommended Interventions:")
print("   - Tutoring programs for low GPA students")
print("   - Attendance monitoring and family outreach")
print("   - Additional support for economically disadvantaged students")
print("   - Early warning system for declining performance")

📦 Bulk Data Tab – Complete Tutorial

Mass data ingestion with wizard mode, lookups, and multi-environment support

Manual/File Ingestion

Simple Upload & Ingest

For straightforward data without relationships:

STEP 1: Prepare JSON file
━━━━━━━━━━━━━━━━━━━━━━
[
  {
    "name": "Contoso Ltd",
    "revenue": 5000000,
    "industrycode": "Manufacturing"
  },
  {
    "name": "Fabrikam Inc",
    "revenue": 3500000,
    "industrycode": "Technology"
  }
]

STEP 2: In Bulk Data tab
━━━━━━━━━━━━━━━━━━━━━━
1. Select "Manual/File Ingestion" sub-tab
2. Choose Environment: "Production"
3. Choose Table: "account"
4. Upload JSON file OR paste JSON
5. Click "Start Bulk Ingestion"
6. Monitor progress (100 records per batch)

RESULT:
✅ Records created in Dataverse
📊 Success/error counts displayed
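If your source data starts out as a CSV export, one easy way to produce records-style JSON in this shape is pandas. A minimal sketch (the file name "accounts.csv" and its columns are illustrative placeholders, not required names):

import pandas as pd

# Convert a CSV export into the list-of-records JSON the Bulk Data tab accepts
df = pd.read_csv("accounts.csv")
df.to_json("accounts.json", orient="records", indent=2)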

Wizard Mode (with Lookups)

Create Records with Parent-Child Relationships

SCENARIO: Import contacts linked to accounts
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 1: Prepare data with lookup keys
{
  "contacts": [
    {
      "firstname": "John",
      "lastname": "Doe",
      "emailaddress1": "john.doe@contoso.com",
      "_account_key": "Contoso Ltd"   // Lookup key
    },
    {
      "firstname": "Jane",
      "lastname": "Smith",
      "emailaddress1": "jane@fabrikam.com",
      "_account_key": "Fabrikam Inc"
    }
  ]
}

STEP 2: Configure Wizard
━━━━━━━━━━━━━━━━━━━
1. Go to "Wizard Mode (Lookups)" sub-tab
2. Select target table: "contact"
3. Click "Configure Lookup Mapping"

STEP 3: Map Lookup Field
━━━━━━━━━━━━━━━━━━━
Lookup Field: parentcustomerid
Related Table: account
Match Key Field: name
My Data Key: _account_key
4. Click "Add Mapping"

STEP 4: Run Ingestion
━━━━━━━━━━━━━━━━━
5. Upload/paste JSON
6. Click "Start Wizard Ingestion"

RESULT:
✅ System automatically:
– Queries accounts by name
– Finds GUIDs
– Creates @odata.bind references
– Creates contacts with proper relationships
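Conceptually, the lookup resolution the wizard performs is equivalent to the following sketch. This is illustrative only, not the app's internal code; it assumes the Python tab's query_dataverse helper:

# Resolve an account name to its GUID and build the @odata.bind reference
accounts = query_dataverse("Production", "account", select_columns=["accountid", "name"])
name_to_guid = dict(zip(accounts["name"], accounts["accountid"]))

record = {"firstname": "John", "lastname": "Doe", "_account_key": "Contoso Ltd"}
guid = name_to_guid[record.pop("_account_key")]
record["parentcustomerid@odata.bind"] = f"/accounts({guid})"
# record is now ready to create as a contact linked to its parent account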

Save Configuration for Reuse

After configuring lookups:
1. Click "Save Configuration"
2. Name it: "Contact Import with Account Lookup"
3. Next time: Click "Load Configuration" → Select saved config
4. Upload new data → Run!

💡 Perfect for recurring imports!

Multi-Target Ingestion

Import to Multiple Tables/Environments Simultaneously

SCENARIO: Import customers to multiple regions
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 1: Create Import Groups
━━━━━━━━━━━━━━━━━━━━━━━
Group 1:
  Environment: "North_America"
  Table: "account"
  Data: [NA customers JSON]
Group 2:
  Environment: "Europe"
  Table: "account"
  Data: [EU customers JSON]
Group 3:
  Environment: "Asia"
  Table: "account"
  Data: [Asia customers JSON]

STEP 2: Execute All Groups
━━━━━━━━━━━━━━━━━━━━━━━
Click "Run All Groups"

RESULT:
✅ All regions updated simultaneously
📊 Individual success metrics per group

Direct @odata.bind

For Advanced Users – Pre-formatted Lookups

When you already have GUIDs:

[
  {
    "firstname": "John",
    "lastname": "Doe",
    "emailaddress1": "john@example.com",
    "parentcustomerid@odata.bind": "/accounts(12345678-1234-1234-1234-123456789012)"
  }
]

Use the "Direct @odata.bind" sub-tab – no lookup resolution needed, records are created directly!

🏭 INDUSTRY EXAMPLES

HEALTHCARE

Import Patient Appointments with Doctor Lookups

DATA STRUCTURE:
{
  "appointments": [
    {
      "msemr_appointmenttype": "Consultation",
      "msemr_scheduledstart": "2025-02-01T10:00:00Z",
      "msemr_scheduledend": "2025-02-01T10:30:00Z",
      "_patient_key": "P12345",   // Patient lookup
      "_doctor_key": "D789"       // Doctor lookup
    }
  ]
}

WIZARD CONFIGURATION:
━━━━━━━━━━━━━━━━━━━━
Target Table: msemr_appointment
Lookup 1:
  Field: msemr_patient
  Related Table: msemr_patient
  Match On: msemr_patientnumber
  Data Key: _patient_key
Lookup 2:
  Field: msemr_practitioner
  Related Table: msemr_practitioner
  Match On: msemr_doctorcode
  Data Key: _doctor_key

RESULT:
✅ 500 appointments imported with proper patient & doctor links
⏱️ 2 minutes total processing time

RETAIL

Import Product Catalog with Category Hierarchy

THREE-LEVEL HIERARCHY: Category → Subcategory → Product

METHOD: Multi-Step Wizard
━━━━━━━━━━━━━━━━━━━━

STEP 1: Import Categories (no lookups)
{
  "categories": [
    {"name": "Electronics", "code": "ELEC"},
    {"name": "Clothing", "code": "CLTH"}
  ]
}
→ Manual Ingestion to "category" table

STEP 2: Import Subcategories (1 lookup)
{
  "subcategories": [
    {"name": "Laptops", "_parent_cat": "ELEC"},
    {"name": "Phones", "_parent_cat": "ELEC"}
  ]
}
→ Wizard Mode with category lookup

STEP 3: Import Products (2 lookups)
{
  "products": [
    {
      "name": "Dell XPS 15",
      "price": 1299.99,
      "_category": "ELEC",
      "_subcategory": "Laptops"
    }
  ]
}
→ Wizard Mode with category + subcategory lookups

RESULT:
✅ Complete catalog with proper hierarchy
📦 1000 products imported
🔗 All relationships preserved

REAL ESTATE

Import Property Listings with Agent & Office Lookups

DATA:
{
  "listings": [
    {
      "address": "123 Main St",
      "city": "Seattle",
      "price": 750000,
      "bedrooms": 3,
      "bathrooms": 2,
      "_agent_email": "agent@realty.com",
      "_office_code": "SEA001"
    }
  ]
}

WIZARD SETUP:
━━━━━━━━━━━━━
Lookup 1:
  Field: listing_agent
  Related: contact
  Match On: emailaddress1
  Data Key: _agent_email
Lookup 2:
  Field: listing_office
  Related: office
  Match On: office_code
  Data Key: _office_code

RUN: Upload 2000 listings → Process in 5 minutes

RESULT:
✅ All listings linked to correct agents & offices
📊 Success rate: 98% (40 failed due to invalid office codes)

🎯 Best Practices

  • Test first: Import 5-10 records before bulk
  • Save configs: Wizard configurations are reusable
  • Batch size: Automatic 100 records per batch (see the sketch after this list)
  • Error handling: Failed records are logged separately
  • Staging first: Import to staging tables for validation
  • Use keys wisely: Choose unique, stable fields for lookups
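If you prepare large loads in the Python tab first, you can mirror the "test first" and batching practices yourself. A minimal sketch, assuming the built-in ingest_to_dataverse helper, a pandas DataFrame df already prepared, and an illustrative staging table name:

# Test first: push a small sample before committing to the full load
sample = df.head(10)
ingest_to_dataverse(sample, "Development", "staging_accounts")

# Then ingest the full set in 100-record slices, mirroring the tab's batch size
batch_size = 100
for start in range(0, len(df), batch_size):
    batch = df.iloc[start:start + batch_size]
    result = ingest_to_dataverse(batch, "Production", "staging_accounts")
    print(f"Batch {start // batch_size + 1}: {result['success']} records ingested")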

📋 Metadata Staging – Complete Tutorial

Extract & document 9 metadata types for compliance, auditing & analysis

9 Metadata Types Available

Type | What It Captures | Use Cases
📋 Tables | Entity definitions, logical names | Schema documentation
📝 Columns | Attributes, data types, max length | Data dictionary
🔗 Relationships | 1:N, N:1, N:N relationships | ERD generation
🎚️ Option Sets | Picklist values, global/local | Reference documentation
📄 Forms | Form layouts, sections | UI documentation
👁️ Views | System/personal views, columns | Query documentation
📊 Charts | Visualization configs | Reporting catalog
📈 Dashboards | Dashboard layouts | Analytics inventory
⚙️ Business Rules | Automation logic | Process documentation

STEP-BY-STEP: Extract Metadata

Basic Extraction

STEP 1: Select Environment
━━━━━━━━━━━━━━━━━━━━━━
Choose: "Production"

STEP 2: Select Metadata Types (check boxes)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
☑ Tables
☑ Columns
☑ Relationships
☑ Option Sets
☐ Forms (uncheck if not needed)
☐ Views
☐ Charts
☐ Dashboards
☐ Business Rules

STEP 3: Extract
━━━━━━━━━━━━━
Click "Extract Metadata"
Wait 30-60 seconds

STEP 4: Review
━━━━━━━━━━━━
Tables appear showing extracted data

STEP 5: Stage to Dataverse
━━━━━━━━━━━━━━━━━━━━━━━
Click "Stage All to Dataverse"
→ Creates staging tables:
– staging_metadata_tables
– staging_metadata_columns
– staging_metadata_relationships
– staging_metadata_optionsets

Use Cases

COMPLIANCE

SOC2 Audit Documentation

REQUIREMENT: Document all data fields & relationships

SOLUTION:
1. Extract Tables + Columns + Relationships
2. Stage to Dataverse
3. Export to Excel/PDF
4. Include in audit package

RESULT:
✅ Complete data dictionary
✅ Entity relationship diagram data
✅ Timestamp proof of schema state
✅ Automated monthly updates
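For the export step, one option is to pull the staged metadata back out through the Python tab and write a multi-sheet workbook. A minimal sketch, assuming the staging table names created above and that an Excel writer engine such as openpyxl is installed:

import pandas as pd

# Pull the staged metadata back out of Dataverse
tables = query_dataverse("Production", "staging_metadata_tables")
columns = query_dataverse("Production", "staging_metadata_columns")
relationships = query_dataverse("Production", "staging_metadata_relationships")

# Write a multi-sheet data dictionary for the audit package
with pd.ExcelWriter("data_dictionary.xlsx") as writer:
    tables.to_excel(writer, sheet_name="Tables", index=False)
    columns.to_excel(writer, sheet_name="Columns", index=False)
    relationships.to_excel(writer, sheet_name="Relationships", index=False)
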
CHANGE MANAGEMENT

Dev vs Prod Schema Comparison

WORKFLOW:
━━━━━━━━━━━━━━━━━━━━━━━━━━
1. Extract metadata from Development
   → Stage to staging_metadata_tables_DEV
2. Extract metadata from Production
   → Stage to staging_metadata_tables_PROD
3. Use Python Tab to compare:

import pandas as pd

dev = query_dataverse("Production", "staging_metadata_tables_DEV")
prod = query_dataverse("Production", "staging_metadata_tables_PROD")

# Find new tables in Dev
new_tables = dev[~dev['logicalname'].isin(prod['logicalname'])]
print(f"New tables in Dev: {len(new_tables)}")
print(new_tables[['displayname', 'logicalname']])

# Find differences in columns
dev_cols = query_dataverse("Production", "staging_metadata_columns_DEV")
prod_cols = query_dataverse("Production", "staging_metadata_columns_PROD")
comparison = pd.merge(dev_cols, prod_cols, on='logicalname', how='outer',
                      suffixes=('_dev', '_prod'), indicator=True)
new_columns = comparison[comparison['_merge'] == 'left_only']
print(f"\nNew columns in Dev: {len(new_columns)}")

RESULT:
✅ Complete schema diff report
✅ Identify breaking changes before deployment
✅ Document changes for release notes

🧹 Data Cleaning – Complete Tutorial

AI-powered data quality improvement before ingestion

Features

  • 🤖 AI Column Matching – Auto-map source to Dataverse fields
  • ✓ Validation – Email, phone, dates, numbers
  • 🔄 Deduplication – Find & remove duplicates
  • ✏️ Transformations – Uppercase, trim, patterns
  • ⚠️ Null Handling – Fill, remove, replace
  • 👁️ Preview – See changes before applying

Workflow

STEP 1: Upload Dirty Data
━━━━━━━━━━━━━━━━━━━━━━
CSV with:
– Wrong column names
– Invalid emails
– Duplicate records
– Null values
– Inconsistent formats

STEP 2: AI Column Matching
━━━━━━━━━━━━━━━━━━━━━━━
Select target Dataverse table: "contact"
Click "AI Match Columns"
→ Automatically maps:
"First Name" → firstname
"Last Name" → lastname
"Email" → emailaddress1
"Phone Number" → telephone1

STEP 3: Apply Transformations
━━━━━━━━━━━━━━━━━━━━━━━━
☑ Validate emails
☑ Format phone numbers
☑ Remove duplicates (by email)
☑ Fill null countries with "USA"
☑ Uppercase last names

STEP 4: Preview Changes
━━━━━━━━━━━━━━━━━━━━
See before/after comparison

STEP 5: Export Clean Data
━━━━━━━━━━━━━━━━━━━━━━
Download or send to Bulk Data tab

STEP 6: Ingest to Dataverse
━━━━━━━━━━━━━━━━━━━━━━━
Clean data → successful import!
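Outside the tab, the same kinds of transformations can be approximated in pandas. A rough sketch only – the column mapping, email pattern, and file name are illustrative, not the tab's actual rules:

import pandas as pd

df = pd.read_csv("dirty_contacts.csv")

# Rename source columns to Dataverse logical names (illustrative mapping)
df = df.rename(columns={"First Name": "firstname", "Last Name": "lastname",
                        "Email": "emailaddress1", "Phone Number": "telephone1"})

# Validate emails with a simple pattern, drop duplicates, normalize case, fill nulls
valid_email = df["emailaddress1"].str.contains(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", na=False)
df = df[valid_email].drop_duplicates(subset="emailaddress1")
df["lastname"] = df["lastname"].str.strip().str.upper()
if "country" in df.columns:
    df["country"] = df["country"].fillna("USA")
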
LEGACY MIGRATION

Clean 50-Year-Old Customer Database

PROBLEM:
– 50,000 records from mainframe
– Column names: "CUST_NM_FN", "CUST_NM_LN"
– Inconsistent formatting
– 30% null phone numbers
– 15% duplicate emails

SOLUTION (Data Cleaning Tab):
1. Upload CSV
2. AI maps cryptic columns to Dataverse
3. Apply transformations:
   – Email validation (remove 2,341 invalid)
   – Phone formatting (standardize all)
   – Deduplicate by email (remove 7,500 dupes)
   – Fill null phones with "(000) 000-0000"
   – Trim all text fields
4. Preview shows 40,159 clean records
5. Export → Bulk Data → Ingest

RESULT:
✅ 40,159 valid records imported
❌ 9,841 records flagged for manual review
⏱️ 45 minutes vs 2 weeks manual cleaning

📊 Visualizations – Complete Tutorial

Create interactive charts directly from Dataverse data

6 Chart Types

Chart Type | Best For | Example
📊 Bar Chart | Comparisons | Revenue by industry
📈 Line Chart | Trends over time | Monthly sales
🥧 Pie Chart | Part-to-whole | Market share
⚫ Scatter Plot | Correlations | Size vs revenue
📉 Area Chart | Volume over time | Cumulative sales
🍩 Donut Chart | Hierarchical proportions | Category breakdown

Create a Chart

STEP 1: Select Data Source
━━━━━━━━━━━━━━━━━━━━━━
Option A: Query from Dataverse
– Environment: Production
– Table: account
– Columns: industrycode, revenue
– Filter: revenue gt 100000
Option B: Use existing data from another tab

STEP 2: Configure Chart
━━━━━━━━━━━━━━━━━━━━
Chart Type: Bar Chart
X-Axis: industrycode
Y-Axis: revenue (sum)
Title: "Revenue by Industry"
Colors: Custom gradient

STEP 3: Customize
━━━━━━━━━━━━━━━━
✓ Show data labels
✓ Add legend
✓ Grid lines
✓ Custom colors

STEP 4: Generate
━━━━━━━━━━━━━━
Click "Create Chart"
→ Interactive chart appears

STEP 5: Export
━━━━━━━━━━━━
Download as PNG or PDF
Or get embed code
EXECUTIVE DASHBOARD

Real-Time KPI Dashboard

5 CHARTS IN ONE VIEW:

Chart 1: Monthly Revenue Trend (Line)
  Query: salesorder grouped by month
Chart 2: Revenue by Region (Pie)
  Query: account revenue grouped by state
Chart 3: Top 10 Customers (Bar)
  Query: account top 10 by revenue
Chart 4: Win Rate Over Time (Area)
  Query: opportunity close rate by quarter
Chart 5: Pipeline Value vs Time (Scatter)
  Query: opportunity value vs days in pipeline

REFRESH: Click button to re-query and update all charts
EXPORT: Save entire dashboard as PDF for board meeting

🔍 Query Tools – Complete Tutorial

5 different query interfaces for every use case

Query Tools Comparison

Tool | Best For | Difficulty
📋 FetchXML/Query | Power users, XML queries | Advanced
🧩 Multi-Table Query | Join multiple tables | Medium
🧠 Universal Query | Any Web API path | Advanced
📈 Visual Query | Beginners, drag-drop | Easy
🔧 SQL Query | SQL experts | Medium

FetchXML Example
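A representative FetchXML query of the kind you might use here (illustrative only – adjust the entity and attribute names to your schema):

<fetch top="50">
  <entity name="account">
    <attribute name="name" />
    <attribute name="revenue" />
    <filter>
      <condition attribute="revenue" operator="gt" value="1000000" />
    </filter>
    <order attribute="revenue" descending="true" />
  </entity>
</fetch>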

Paste in FetchXML/Query tab → Execute → View results

SQL Query Example

SELECT a.name, a.revenue, c.fullname
FROM account a
LEFT JOIN contact c ON a.accountid = c.parentcustomerid
WHERE a.revenue > 1000000
ORDER BY a.revenue DESC

System converts to Dataverse API calls automatically!

📊 Reports & Comparison – Complete Tutorial

Compare staging vs production or any two data sources

Use Cases

  • Staging vs Production validation
  • Dev vs UAT comparison
  • Pre vs post migration verification
  • Multi-environment sync check
WORKFLOW:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
STEP 1: Select Source 1
  Environment: Production
  Table: staging_accounts
STEP 2: Select Source 2
  Environment: Production
  Table: accounts
STEP 3: Select Key Field
  Field: name (must exist in both)
STEP 4: Run Comparison
  Click "Compare"

RESULTS:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ In both: 8,432 records (matches)
⚠️ Only in staging: 156 records (new)
⚠️ Only in production: 23 records (missing from staging)
⚠️ Different values: 89 records (data changes)

STEP 5: Export Results
  Download differences as CSV
  Or send to Bulk Data for correction
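The same comparison can be scripted from the Python tab if you want to automate it. A minimal sketch that also flags value differences for matched keys (the compared column, "revenue", is illustrative):

import pandas as pd

staging = query_dataverse("Production", "staging_accounts", select_columns=["name", "revenue"])
prod = query_dataverse("Production", "account", select_columns=["name", "revenue"])

merged = pd.merge(staging, prod, on="name", how="outer",
                  suffixes=("_staging", "_prod"), indicator=True)

only_staging = merged[merged["_merge"] == "left_only"]
only_prod = merged[merged["_merge"] == "right_only"]
in_both = merged[merged["_merge"] == "both"]

# Rows present in both sources but with differing values
changed = in_both[in_both["revenue_staging"] != in_both["revenue_prod"]]
print(f"Matches: {len(in_both)}, new: {len(only_staging)}, "
      f"missing: {len(only_prod)}, changed: {len(changed)}")
changed.to_csv("comparison_differences.csv", index=False)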

🔄 Data Deduplication – Complete Tutorial

Find and remove duplicate records before or after import

Matching Options

  • Exact match on single field (email)
  • Exact match on multiple fields (firstname + lastname)
  • Fuzzy matching (similar names)
  • Custom key combination
SCENARIO: Import 10,000 contacts, check for duplicates
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
STEP 1: Upload Data
  10,000 contact records (CSV or JSON)
STEP 2: Select Dataverse Table
  Table: contact
  Environment: Production
STEP 3: Configure Deduplication
  Match Field: emailaddress1
  Method: Exact match
STEP 4: Run Check
  Click "Check for Duplicates"

RESULTS:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Total records uploaded: 10,000
✅ New records (not in Dataverse): 7,234
❌ Duplicates (already in Dataverse): 2,766

STEP 5: Export
  Download "New Records Only"
  → 7,234 clean records
  → Import these to avoid duplicates!
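The same exact-match check can be reproduced in the Python tab. A minimal sketch, assuming the uploaded file is a CSV with an emailaddress1 column (file names are illustrative):

import pandas as pd

uploaded = pd.read_csv("new_contacts.csv")
existing = query_dataverse("Production", "contact", select_columns=["emailaddress1"])

# Exact match on email: keep only records whose email is not already in Dataverse
is_duplicate = uploaded["emailaddress1"].isin(existing["emailaddress1"])
new_records = uploaded[~is_duplicate]

print(f"Total uploaded: {len(uploaded)}")
print(f"New records: {len(new_records)}")
print(f"Duplicates: {is_duplicate.sum()}")
new_records.to_json("new_records_only.json", orient="records", indent=2)
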
CRM CLEANUP

Merge 3 Legacy Databases

CHALLENGE: Combining 3 acquired companies' customer lists

DATABASE A: 25,000 customers
DATABASE B: 18,000 customers
DATABASE C: 12,000 customers
TOTAL: 55,000 records

DEDUPLICATION PROCESS:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. Upload Database A
   → 25,000 records imported
2. Dedupe Database B against Dataverse
   Result: 12,456 new, 5,544 duplicates
   Import: 12,456 new records
3. Dedupe Database C against Dataverse
   Result: 7,891 new, 4,109 duplicates
   Import: 7,891 new records

FINAL RESULT:
✅ 45,347 unique customers (vs 55,000 raw)
❌ 9,653 duplicates prevented
💾 Saved: Hours of manual deduping

🔍 CRUD Operations – Complete Tutorial

Single-record Create, Read, Update, Delete operations

When to Use

  • Quick single record lookups
  • Manual data fixes
  • Testing API calls
  • One-off updates

CREATE – Add New Record

STEP 1: Select Environment & Table
  Environment: Production
  Table: account

STEP 2: Enter JSON Payload
{
  "name": "New Customer Corp",
  "revenue": 2500000,
  "industrycode": "Technology",
  "emailaddress1": "info@newcustomer.com"
}

STEP 3: Click "Create"

RESULT:
✅ Record created
📋 New GUID returned: 12345678-1234-1234-1234-123456789012

READ – Retrieve Record

STEP 1: Enter Record ID
  GUID: 12345678-1234-1234-1234-123456789012

STEP 2: Click "Read"

RESULT:
{
  "accountid": "12345678-1234-1234-1234-123456789012",
  "name": "New Customer Corp",
  "revenue": 2500000,
  "industrycode": "Technology",
  "createdon": "2025-01-26T15:30:00Z"
}

UPDATE – Modify Record

STEP 1: Enter Record ID
  GUID: 12345678-1234-1234-1234-123456789012

STEP 2: Enter Update JSON (only fields to change)
{
  "revenue": 3000000,
  "websiteurl": "https://newcustomer.com"
}

STEP 3: Click "Update"

RESULT:
✅ Record updated
📝 Fields modified: 2

DELETE – Remove Record

STEP 1: Enter Record ID
  GUID: 12345678-1234-1234-1234-123456789012

STEP 2: Click "Delete"

STEP 3: Confirm deletion

RESULT:
✅ Record deleted
⚠️ This action cannot be undone!
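Under the hood, these operations map onto the standard Dataverse Web API verbs. A rough sketch with the requests library, for orientation only – the tab handles authentication for you, the bearer token below is a placeholder, and the API version segment (v9.2 here) may differ in your org:

import requests

base = "https://yourorg.crm.dynamics.com/api/data/v9.2"
headers = {"Authorization": "Bearer <access_token>",  # placeholder token
           "Content-Type": "application/json"}

# CREATE – POST a new account
requests.post(f"{base}/accounts", headers=headers,
              json={"name": "New Customer Corp", "revenue": 2500000})

# READ – GET a record by GUID
guid = "12345678-1234-1234-1234-123456789012"
record = requests.get(f"{base}/accounts({guid})", headers=headers).json()

# UPDATE – PATCH only the fields that change
requests.patch(f"{base}/accounts({guid})", headers=headers,
               json={"revenue": 3000000})

# DELETE – remove the record permanently
requests.delete(f"{base}/accounts({guid})", headers=headers)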

🗑️ Bulk Delete – Complete Tutorial

Delete large volumes of records across multiple tables/environments

⚠️ DANGER ZONE

Bulk Delete is PERMANENT and IRREVERSIBLE!

Always:

  • ✅ Test in non-production first
  • ✅ Export data before deletion
  • ✅ Verify filters carefully
  • ✅ Start with small batches

Deletion Methods

Method 1: Delete by IDs

STEP 1: Create Delete Group
  Name: "Remove Test Accounts"
  Environment: Development
  Table: account

STEP 2: Upload IDs (JSON or CSV)
[
  "12345678-1234-1234-1234-123456789012",
  "abcdef12-abcd-abcd-abcd-abcdef123456"
]

STEP 3: Review Count
  2 records will be deleted

STEP 4: Execute
  ✅ 2 records deleted
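If you need to build the ID list first, you can collect the GUIDs from the Python tab. A minimal sketch (the filter and output file name are illustrative):

import json

# Query the records you intend to delete and export their GUIDs as a JSON array
test_accounts = query_dataverse("Development", "account",
                                select_columns=["accountid", "name"],
                                filter_query="startswith(name, 'TEST_')")
with open("delete_ids.json", "w") as f:
    json.dump(test_accounts["accountid"].tolist(), f, indent=2)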

Method 2: Delete by Filter

STEP 1: Configure Filter
  Field: name
  Operator: startswith
  Value: "TEST_"
  This will delete all accounts where name starts with "TEST_"

STEP 2: Preview
  System shows: 47 records match this filter

STEP 3: Confirm & Execute
  ✅ 47 test records deleted

Method 3: Delete ALL Records (DANGEROUS!)

⚠️ USE WITH EXTREME CAUTION

STEP 1: Check "Delete All Records in Table"
STEP 2: Type "DELETE ALL" to confirm
STEP 3: Execute

This deletes EVERY record in the table!

Only use for:
– Clearing test data
– Resetting development environments
– Removing temporary staging tables

DATA CLEANUP

Remove Invalid Test Data After Migration

SITUATION: After migration, discovered 2,341 test records

SOLUTION: Bulk Delete with Filter
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Delete Group 1: Test Accounts
  Table: account
  Filter: name contains "ZZZTEST"
  Result: 847 accounts deleted

Delete Group 2: Test Contacts
  Table: contact
  Filter: lastname = "TestUser"
  Result: 1,234 contacts deleted

Delete Group 3: Test Orders
  Table: salesorder
  Filter: ordernumber startswith "TEST"
  Result: 260 orders deleted

TOTAL: 2,341 test records removed
TIME: 5 minutes vs hours of manual deletion
